PREFACE



The purpose of this technical report is to document the procedures used to 
collect and process postsecondary school transcripts for a subsample of members 
of the younger (i.e., 1980 sophomore) cohort of High School and Beyond who 
attended postsecondary institutions at any time after leaving high school. The 
following outline provides a general guide to the contents of the report. 

■ Chapter 1 contains an introduction to the longitudinal studies program 
administered by the National Center for Education Statistics of the U.S. 
Department of Education; it also describes the scope of the transcript 
study. 

■ Chapter 2 summarizes the procedures used to collect transcript data from 
educational institutions. 



■ Chapter 3 describes the Computer-Assisted Data Entry (CADE) program with
which transcripts were coded and converted to machine-readable form.

■ Chapter 4 includes a discussion of data editing procedures. 

■ Chapter 5 describes the procedures used to construct sampling weights for
use in computing population estimates.






TABLE OF CONTENTS

1. INTRODUCTION
   1.1 Overview
       1.1.1 The NCES Longitudinal Studies Program
       1.1.2 Relationships Between High School and Beyond and NLS-72
   1.2 History of High School and Beyond
       1.2.1 The Base Year Survey
       1.2.2 The First Follow-Up Survey
       1.2.3 The Second Follow-Up Survey
       1.2.4 The Third Follow-Up Survey
   1.3 Related Studies
       1.3.1 Other Base Year Files
       1.3.2 Other Special Studies Files
       1.3.3 Merged Base Year and First Follow-Up Files
   1.4 Scope of the Postsecondary Education Transcript Studies

2. DATA COLLECTION
   2.1 Data Collection Objectives
   2.2 Mailout of Transcript Requests to Institutions
   2.3 Data Collection Results
       2.3.1 The School-Level Response Rate
       2.3.2 The Transcript-Level Response Rate
       2.3.3 Student-Level Data Collection Results

3. DATA PREPARATION
   3.1 Data Preparation Objectives
   3.2 Data Organization
   3.3 Computer-Assisted Data Entry (CADE)
       3.3.1 CADE Concept
       3.3.2 CADE Equipment: Hardware and Software
       3.3.3 CADE Operator Training
   3.4 Data Quality Management

4. DATA PROCESSING
   4.1 Machine Editing
   4.2 Organization and Content of the Data File
       4.2.1 The Student Record
       4.2.2 The Transcript Record
       4.2.3 The Term Record
       4.2.4 The Course Record
   4.3 Merging Records
   4.4 The Cautionary Note on the Use of Credits and Grades Data in the
       Postsecondary Transcripts Database

5. SAMPLE DESIGN AND IMPLEMENTATION
   5.1 Base Year Sample Design
   5.2 1980 Sophomore Cohort Sample Design for Second and Third Follow-Up
       Surveys
   5.3 The Senior Cohort Postsecondary Education Transcript Study (PETS)
       Sample
   5.4 The Sophomore Cohort Postsecondary Education Transcript Study Sample
   5.5 Sample Weights
   5.6 Standard Errors and Design Effects

EXHIBITS:
   1.1 Research Design For National Education Longitudinal Studies
   3.1 HS&B Transcripts Study: Data Organization
   3.2-3.10 CADE Screens
   4.1 A Schematic Diagram of the Database Hierarchy Representing Nested
       Transcript, Term, and Course Records for Three Sample Students

TABLES:
   2.1 Response Rates to the HS&B Postsecondary Education Transcript Study
       by Institution Type
   2.2 Transcript Dispositions
   2.3 Return Rates for Participating Schools
   2.4 Number of Transcripts Received: HS&B Postsecondary Transcript Study
   5.1 High School and Beyond Base Year School Sample Selections
   5.2 High School and Beyond Base Year Sample Realization
   5.3 1980 Sophomore Cohort Second Follow-Up Sample Distribution by
       Race-Ethnicity Typology
   5.4 High School and Beyond Sophomore Postsecondary Transcript Sample
   5.5 Number of Postsecondary Schools Reported by Members of the HS&B 1980
       Sophomore Cohort
   5.6 Nonresponse Adjustments to Sampling Weights for Completed Cases in
       HS&B Sophomore Cohort Postsecondary Education Transcript Study (WT1)
   5.7 Nonresponse Adjustments to Sampling Weights for Cases with At Least
       One Postsecondary Transcript and Completed Questionnaires from the
       Base Year, First, Second, and Third Follow-Up Surveys (WT2)
   5.8 High School and Beyond Sophomore Cohort Postsecondary Education
       Transcripts Study Statistical Properties of Sample Case Weights
   5.9 Distributional Statistics for Design Effects and Root Design Effects
       for 30 Survey Measures for 12 Domains
   5.10 Distributional Statistics for Design Effects and Root Design Effects
       for Proportions from Various Survey Waves HS&B Sophomore Cohort

APPENDICES:
   Appendix A: List of Endorsing Institutions
               Contents of School Transcript Request Packages
   Appendix B: Course Subject Codes in Numerical Order






1. INTRODUCTION



The High School and Beyond (HS&B) Postsecondary Education Transcript Study,
conducted in 1987, involved the collection and processing of school transcripts
for a subsample of the members of the HS&B younger cohort--that is, the study's
1980 sophomores--who had attended any form of postsecondary institution since
leaving high school. Transcripts were requested from schools reported by sample
members in their responses to the HS&B second follow-up (1984) and third
follow-up (1986) surveys. Records were obtained from all types of postsecondary
institutions, ranging from those offering short-term vocational or occupational
programs through major universities with graduate programs and professional
schools. Information from the transcripts, including terms of attendance, fields
of study, specific courses taken, and grades and credits earned, was coded and
processed into a system of data files designed to be merged with HS&B
questionnaire data files.

The purpose of the Postsecondary Education Transcripts Study is to provide
reliable and objective information about the types and patterns of postsecondary
courses taken by HS&B sample members since the base year data were collected in
1980. Because the transcript data file supplements a large, expanding database
from the HS&B survey, course-taking patterns and performance can be statistically
related to a wide range of other factors, including student characteristics and
occupational and economic outcomes.

1.1 Overview 

1.1.1 The NCES Longitudinal Studies Program

The mandate of the Department of Education National Center for Education
Statistics (NCES), formerly the Center for Education Statistics (CES), includes
the responsibility to "collect and disseminate statistics and other data related
to education in the United States" and to "conduct and publish reports on
specific analyses of the meaning and significance of such statistics" (Education
Amendments of 1974-Public Law 93-380, Title V, Section 501, amending Part A of
the General Education Provisions Act).

Consistent with this mandate and in response to the need for
policy-relevant, time-series data on nationally representative samples of high
school students, NCES instituted the National Education Longitudinal Studies
(NELS) program, a continuing long-term project. The general aim of the NELS
program is to study longitudinally the educational, vocational, and personal
development of high school students, and the personal, familial, social,
institutional, and cultural factors that may affect that development.

The overall NELS program uses longitudinal, time-series data in two ways:
(1) each cohort is surveyed at regular intervals over a span of years, and
(2) comparable data are obtained from successive cohorts, permitting studies of
trends relevant to educational and career development and societal roles. Thus
far, the NELS program consists of two major studies: The National Longitudinal
Study of the High School Class of 1972 (NLS-72) and High School and Beyond
(HS&B). (A third major study, the National Education Longitudinal Study of 1988,
known as NELS:88, began in 1988 and will continue throughout the decade of the
1990s.)

The first major study, NLS-72, began with the collection of comprehensive
base year survey data from approximately 19,000 high school seniors in the spring
of 1972. The NLS-72 first follow-up survey added to the sample nearly 4,500
individuals who had been unable to participate at the time of the base year
survey. Three more follow-up surveys were conducted in the fall and winter of
1974, 1976, and 1979, using a combination of mail surveys and personal and
telephone interviews. The fifth follow-up survey was fielded during the spring
of 1986.

The second major survey, HS&B, was designed to inform federal and state
policy in the decade of the 1980s. HS&B began in the spring of 1980 with the
collection of base year questionnaire and test data on over 58,000 high school
seniors and sophomores. The first follow-up survey was conducted in the spring
of 1982, and the second follow-up survey in the spring of 1984. The HS&B third
follow-up survey was conducted in the spring of 1986.

Three survey cohorts--NLS-72 seniors, HS&B seniors, and HS&B sophomores--are
displayed in Exhibit 1-1 according to their initial and subsequent survey years
and their modal age at the time of each survey. As shown, the NLS-72 seniors
were first surveyed in 1972 at age 18 and have been resurveyed five times since,
with the last survey occurring in 1986 when these young adults were about 32
years of age. The HS&B cohorts have been surveyed at points in time that would
permit as much comparison as possible with the time points selected for NLS-72.
In particular, three types of comparison are possible.

First, the three cohorts may be compared on a time-lag basis (intercohort or
intergenerational). For example, the high school seniors of 1972 and the high
school seniors of 1980 and 1982 may be contrasted to determine changes over time
in the composition, distribution, and needs of high school seniors.

Second, fixed-time comparisons can be undertaken. For a given year, the
data collection for each cohort can be viewed as a cross-sectional study. It is
possible, for example, to compare employment rates in 1980 of 16-, 18-, and
26-year-olds.

The third type of analysis is longitudinal (within cohort) and is
designated in Exhibit 1-1 by the diagonal lines. Because the history of the age
cohort can be taken into account and modeled, analyses can be designed that
isolate school and program effects from the effects of differential life
experiences.

1.1.2 Relationships Between High School and Beyond and NLS-72 

High School and Beyond was designed to build on the NLS-72 in three ways.
First, the base year survey of HS&B included a 1980 cohort of high school seniors
that was directly comparable with the 1972 cohort. Replication of selected 1972
student questionnaire items and test items made it possible to analyze changes
that occurred after 1972 and their relationship to new federal policies and
programs in education. Second, the introduction of a sophomore cohort provided
data on the many critical educational and vocational choices made between the
sophomore and senior years in high school, permitting a fuller understanding of
the secondary school experience and its impact on students. Finally, HS&B
expanded the NLS-72 focus by collecting data on a range of life cycle factors,
such as family formation behavior, intellectual development, and social
participation.

1.2 History of High School and Beyond 

1.2.1 The Base Year Survey 

The base year survey was conducted in spring 1980. The study design
provided for a highly stratified national probability sample of over 1,100
secondary schools as the first-stage units of selection. In the second stage, 36
seniors and 36 sophomores were selected per school (in schools with fewer than 36
in either of these groups, all eligible students were included). Special efforts
were made to identify those students within the sample who were twins or triplets
so that their co-twins or co-triplets could also be invited to participate in the
study. (Data from non-sampled twins and triplets are not included in the student
data files, but are available in a separate Twin Data File that links
questionnaire data for both sampled and non-sampled twins for special analyses.)
Over 30,000 sophomores and 28,000 seniors enrolled in 1,015 public and private
high schools across the country participated in the base year survey. (Detailed
information about the samples can be found in the HS&B sample design report for
the base year: Martin R. Frankel, Luane Kohnke, David Buonanno, and Roger
Tourangeau, Sample Design Report, NORC, 1981.)
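
The second-stage rule described above can be illustrated with a minimal Python
sketch. It assumes simple random sampling within each school, which this report
does not specify; the function name, its parameters, and the example rosters
are hypothetical and are not taken from the HS&B sampling programs.

```python
import random

def select_students(eligible_ids, per_school=36, rng=None):
    """Return per_school randomly chosen IDs, or every ID if the school has fewer.

    Assumption: simple random sampling without replacement within a school.
    """
    rng = rng or random.Random()
    ids = list(eligible_ids)
    if len(ids) <= per_school:
        # Small schools: all eligible students are included.
        return ids
    return rng.sample(ids, per_school)

# Hypothetical rosters: one school with 50 eligible sophomores, one with 20.
large_school = select_students(range(50), rng=random.Random(0))
small_school = select_students(range(20))
```

Under these assumptions the larger school contributes exactly 36 students and
the smaller school contributes all 20.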

Certain types of schools were oversampled to make the sample more useful for 
policy analysis. These included: 

■ public schools with high percentages of Hispanic students, to ensure 
sufficient numbers of Cuban, Puerto Rican, and Mexican students for 
separate analysis 

■ Catholic schools with high percentages of minority group students 

■ alternative public schools 

■ private schools with high-achieving students 

The Hispanic supplement to the sample was funded jointly by the Office of
Bilingual Education and Minority Language Affairs (OBEMLA) and the Office for
Civil Rights (OCR) within the Department of Education. An additional
supplementary sample was drawn from students attending Department of Defense
Dependents Schools (DoDDS) located overseas. DoDDS students are not included in
the data tapes distributed by NCES, however.

Survey instruments in the base year included: 

■ senior questionnaire

■ sophomore questionnaire

■ student identification pages

■ a series of cognitive tests for each cohort

■ school questionnaire

■ teacher comment checklist

■ parent questionnaire (mailed to a sample of parents
from both cohorts)

The student questionnaires focused on individual and family background,
high school experiences, work experiences, and plans for the future. The
student identification pages included a series of items on the student's use of
non-English languages, proficiency in them, and classroom experience in which
those languages were used. These pages also included information that would be
useful for locating the students for future follow-up surveys.

The cognitive tests measured both verbal and quantitative abilities; in 
addition, sophomore tests included achievement measures in science, writing, and 
civics, while seniors were asked to respond to tests measuring abstract and 
nonverbal abilities. Of the 194 test items administered to the HS&B senior 
cohort in the base year, 86 percent were identical to items that had been given 
to the NLS-72 base year respondents. 

School questionnaires, which were filled out by an official in each
participating school, provided information about enrollment, staff, educational
programs, facilities and services, dropout rates, and special programs for
handicapped and disadvantaged students. The Teacher Comment Checklist provided
teacher observations on students participating in the survey. The Parent
Questionnaire elicited information about how family attitudes and financial
planning affected postsecondary educational goals.

1.2.2 The First Follow-Up Survey 

The first follow-up sample consisted of approximately 30,000 1980
sophomores and 12,200 1980 seniors. It retained the multi-stage, stratified,
and clustered design of the base year sample. All students who had been
selected for inclusion in the base year survey, whether or not they actually
participated, had a chance of being included in the first follow-up sample.
Unequal probabilities were compensated by weighting.
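
As a rough illustration of how weighting compensates for unequal selection
probabilities, the sketch below uses the standard inverse-probability base
weight. This is an assumption made for illustration only: the weights actually
distributed with HS&B also carry the nonresponse adjustments discussed in
Chapter 5, and the probabilities and function names here are hypothetical.

```python
def base_weight(selection_probability):
    """Inverse-probability base weight for one sample member."""
    if not 0 < selection_probability <= 1:
        raise ValueError("selection probability must be in (0, 1]")
    return 1.0 / selection_probability

def weighted_mean(values, weights):
    """Weighted estimate of a population mean."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical example: one member sampled with probability 1/4 (for example,
# from a stratum sampled at a low rate), one retained with certainty.
weights = [base_weight(0.25), base_weight(1.0)]
attended_college = [1, 0]
estimate = weighted_mean(attended_college, weights)
```

The member selected with probability 1/4 stands for four population members,
so the weighted proportion here is 4/5 rather than the unweighted 1/2.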

A subsample of 11,500 students was selected from among the senior cohort
base year participants. This subsampling was carried out to ensure adequate
analytic power to address policy issues in areas such as excellence in education,
access to postsecondary education, need for financial aid, and the impact of
education on career choices. A special sample of 495 students was selected from
among those 1980 seniors who had been selected for inclusion in the base year
survey but who had not actually participated.

As in the base year survey, the Hispanic supplement to the first follow-up
survey was supported by OBEMLA and OCR. In addition, the United States Army
Recruiting Command (USARC) supported the retention in the first follow-up sample
of 200 additional 1980 seniors who had moderate to high achievement scores but no
plans for postsecondary education.

For the senior cohort, a self-administered mail-back questionnaire was the
basic method of data collection. Approximately 12,200 packets containing survey
questionnaires, instruction sheets, and incentive payment checks were sent to
sample members during the first week of February 1982. Approximately 75 percent
of the targeted senior cohort members completed and returned first follow-up
questionnaires by mail. An additional 19 percent completed the questionnaires by
either in-person or telephone interviews. Respondents who completed the
questionnaire via telephone interview were required to have a copy of the
questionnaire in front of them while doing so, to keep their survey experience as
similar as possible to that of the majority of respondents, who filled out the
questionnaires themselves. Follow-up interviewing was halted in mid-July of
1982, after a response rate of 94 percent had been obtained.

First follow-up data for 1980 sophomores were collected through group
administrations of questionnaires and tests. The sophomore group administrations
were conducted either in the sampled students' high school or in an appropriate
location off campus. The location of the administration depended on the survey
member's school enrollment status during the data collection period (February
through May 1982). Group administrations were scheduled off-campus for sample
members who were no longer attending the sampled schools. These individuals
(e.g., transfer students, dropouts, early graduates) were contacted by NORC
survey representatives and brought together in small groups of two to six
participants. The same survey administration procedures were followed for both
types of group administration. Follow-up ended in mid-July of 1982, after
response rates of 81 and 89 percent had been obtained for the questionnaires and
tests, respectively.

A first follow-up school questionnaire was requested of all schools selected 
in the base year (including those that had refused to participate), with the 
exception of schools that had no 1980 sophomores, that had closed, or that had 
merged with other schools in the sample. Schools that had received en masse 
transfers of students from base year schools were contacted to complete a first 
follow-up school questionnaire and to arrange student survey activities. These 
schools are not considered to be part of the probability sample of secondary 
schools and do not appear on the Updated School Data File. The first follow-up 
survey also included a sample of students from the Department of Defense 
Dependents Schools (DoDDS). DoDDS students were not part of the main probability 
sample and were not weighted. 

1.2.3 The Second Follow-Up Survey 

The sample design for the second follow-up survey was the same as that used 
for the first follow-up. Survey activities were initiated for all individuals 
who had participated in the first follow-up except for those who were known to be 
deceased. 



As in the first follow-up survey, mail-back questionnaires were again the 
basic method of data collection for the seniors and, in this follow-up, for the 
sophomores as well. During the first week of February 1984, approximately 12,000 
packets of survey materials were mailed to the last known addresses of the senior 
sample members and approximately 14,825 sophomore sample members. Extensive 
telephone prompting was used to encourage sample members to respond by mail. 
When this failed, interviews were conducted by telephone or in person. 

Approximately 73 percent of the senior cohort sample members mailed back
their completed questionnaires; about 13 percent were interviewed by telephone;
and about 5 percent were interviewed in person. Among the sophomores,
approximately 73 percent mailed back their completed questionnaires; about 14
percent were interviewed by telephone; and about 5 percent were interviewed in
person. As in the earlier follow-up, the survey design required that respondents
who were to be interviewed over the telephone or in person have a copy of the
questionnaire before them during the interview, to minimize bias due to the
method of administration. Follow-up interviewing continued through July 1984,
and resulted in a completion rate of over 91 percent for the seniors and 92
percent for the sophomores.
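
The completion rates quoted above can be cross-checked against the reported
mode-of-response shares. The short Python sketch below simply sums the rounded
percentages given in the text; the variable names are illustrative, and because
the inputs are rounded the totals are only approximate.

```python
# Rounded percentages reported for the second follow-up, by response mode.
mode_shares = {
    "seniors": {"mail": 73, "telephone": 13, "in_person": 5},
    "sophomores": {"mail": 73, "telephone": 14, "in_person": 5},
}

# Total completion rate per cohort is the sum across response modes.
completion = {cohort: sum(shares.values()) for cohort, shares in mode_shares.items()}
```

The sums, 91 percent for seniors and 92 percent for sophomores, agree with the
completion rates reported for the two cohorts.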

1.2.4 The Third Follow-Up Survey

As in the second follow-up, mail-back questionnaires were the basic method
of administration, supplemented by telephone and in-person interviews. During
the last week of February 1986, approximately 26,820 packets of survey materials
were mailed to the last known addresses of the sample members (senior and
sophomore). Reminder/thank you postcards were mailed to respondents after two
weeks. Telephone prompting started three weeks later. When this failed to elicit
a response, an effort was made to complete the case by telephone. The final
attempt was made through in-person interviewing.

Follow-up interviewing continued into September, resulting in a completion
rate of 91 percent among sophomores, 88 percent among seniors, and 90 percent
overall.

1.3 Related Studies 

In addition to the core surveys described above, a number of related studies
have been undertaken. Besides the transcript study described in this manual,
such studies have included the collection of high school transcripts and
postsecondary financial aid data for the HS&B sophomore cohort, and the
collection of postsecondary education transcripts and financial aid data for the
HS&B seniors. Data files for these studies and other HS&B data, such as parent
surveys, school surveys, and teacher comments, are described below. Users'
manuals or other forms of documentation are available from NCES for all data
files. These auxiliary data files greatly expand the analytic potential of the
core data sets, and researchers are encouraged to become familiar with them.

1.3.1 Other Base Year Files 

The Language File contains information on each student who during the base
year reported some non-English language experience, either during childhood or at
the time of the survey. This file contains 11,303 records (sophomores and
seniors combined), with 42 variables for each student.

The Parent File contains questionnaire responses from the parents of about 
3,600 sophomores and 3,600 seniors who are on the Student File. Each record on 
the Parent File contains a total of 307 variables. Data on this file include 
parents' aspirations and plans for their children's postsecondary education. 

The Twin and Sibling File contains base year responses from sampled twins
and triplets; data on non-sampled twins and triplets of sample members; and data
from siblings in the sample. This file (2,718 records) includes all of the
variables that are on the HS&B student file, plus two additional variables
(family ID and SETTYPE, the type of twin or sibling).

The Sophomore Teacher Comment File contains responses from 14,103 teachers 
on 18,291 students from 616 schools. The Senior Teacher Comment File contains 
responses from 13,683 teachers on 17,056 students from 611 schools. At each 
grade level, teachers had the opportunity to answer questions about HS&B sampled 
students who had been in their classes. The typical student in the sample was 
rated by an average of four different teachers. These files contain 
approximately 76,000 teacher observations of sophomores and about 67,000 teacher 
observations of seniors. 

The Friends' File contains identification numbers of students in the HS&B 
sample who were named as friends of other HS&B sampled students. Each record 
contains the IDs of sampled students and IDs of up to three friends. Linkages 
among friends can be used to investigate the sociometry of friendship structures, 
including reciprocity of choices among students in the sample, and to trace 
friendship networks. 
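
To make the idea of reciprocity concrete, the following Python sketch finds
mutually named pairs in a set of friendship nominations. The record layout (a
mapping from a student ID to a list of up to three friend IDs) and the example
IDs are hypothetical simplifications, not the actual layout of the Friends'
File.

```python
def reciprocal_pairs(nominations):
    """Return the set of (id_a, id_b) pairs in which each student named the other.

    nominations: dict mapping a student ID to a list of up to three friend IDs.
    """
    pairs = set()
    for student, friends in nominations.items():
        for friend in friends:
            # A choice is reciprocated when the named friend also names the student.
            if student in nominations.get(friend, ()):
                pairs.add(tuple(sorted((student, friend))))
    return pairs

# Hypothetical nominations: 1 and 2 name each other; 1 -> 3 and 3 -> 4 are one-way.
example = {1: [2, 3], 2: [1], 3: [4], 4: []}
mutual = reciprocal_pairs(example)
```

Each reciprocated pair appears once, regardless of which member's record it was
found on.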

1.3.2 Other Special Studies Files 

The High School Transcript File describes the course-taking behavior of
15,941 sophomores of 1980 throughout their four years of high school. Data
include a six-digit course number for each course taken, along with course
credit, course grade, and year taken. (Course numbers correspond with the
descriptions in A Classification of Secondary School Courses (CSSC), developed
by Evaluation Technologies, Inc., under contract with NCES, July 1982.) Other
items of information, such as grade point average, days absent, and
standardized test scores, are also contained on the file.

The Offerings and Enrollments File contains school information, course
offerings, and enrollment data for 957 schools. Other information, such as
credit offered by the school, is also contained on each record.

The Updated School File contains base year data (966 completed
questionnaires) and first follow-up data (956 completed questionnaires) from
1,015 participating schools in the HS&B sample. First follow-up data were
requested only from those schools that were still in existence in the spring of
1982 and had members of the 1980 sophomore cohort currently enrolled. Each high
school is represented by a single record that includes 230 data elements from the
base year school questionnaire, if available, along with other information from
sampling files (e.g., stratum codes, case weights).

The Postsecondary Education Transcript File for the HS&B seniors contains
transcript data on dates of attendance, fields of study, degrees earned, and the
titles, grades, and credits of every course attempted at each school attended,
coded into hierarchical files with the student as the highest level of
aggregation. Although no survey forms were used, detailed procedures were
developed for extracting and processing information from the postsecondary school
transcripts that were collected for all members of the 1980 senior cohort who
reported attending any form of postsecondary schooling in the first or second
follow-up surveys. (Over 7,000 individuals reported over 11,000 instances of
school attendance.)
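
The hierarchical organization described above, with the student at the highest
level of aggregation, can be pictured with a small Python sketch. The nesting
of student, transcript, term, and course follows the record types named in this
report, but the class and field names and the example values are hypothetical
and do not reproduce the actual file layout.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Course:
    title: str
    grade: str
    credits: float

@dataclass
class Term:
    label: str
    courses: List[Course] = field(default_factory=list)

@dataclass
class Transcript:
    school_name: str
    terms: List[Term] = field(default_factory=list)

@dataclass
class Student:
    student_id: str
    transcripts: List[Transcript] = field(default_factory=list)

def count_courses(student):
    """Count course records nested under one student record."""
    return sum(len(term.courses) for t in student.transcripts for term in t.terms)

# Hypothetical student with one transcript, two terms, and three courses.
example = Student("S001", [
    Transcript("Example College", [
        Term("Fall 1982", [Course("English Composition", "B", 3.0),
                           Course("College Algebra", "C", 3.0)]),
        Term("Spring 1983", [Course("Introductory Biology", "A", 4.0)]),
    ]),
])
total_courses = count_courses(example)
```

Walking the hierarchy from the student downward reaches every course attempted
at every school attended, which is what allows course-level data to be
summarized at the student level.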

The Senior Financial Aid File contains financial aid records from 
postsecondary institutions which respondents reported attending, and federal 
records of the Guaranteed Student Loan program and of the Pell Grant program. 

The Sophomore Financial Aid File contains information from federal records 
from the Guaranteed Student Loan program and from the Pell Grant program for all 
students who reported attending postsecondary education and who had participated 
in either of these two programs. 

The HS&B HEGIS and PSVD File contains the postsecondary school codes for 
schools HS&B respondents reported attending in the first and second follow-ups. 
In addition, the file provides data on institutional characteristics, such as 
type of institution, highest degree offered, enrollment, admissions requirements, 
tuition, and so forth. This file permits analysts to link HS&B questionnaire 
data with institutional data for postsecondary schools attended by respondents. 

1.3.3 Merged Base Year and First Follow-Up Files 

The First Follow-Up Sophomore File contains responses from 29,737 students 
and includes both base year and first follow-up data. This file includes 
information on school, family, work experiences, educational and occupational 
aspirations, personal values, and test scores of sample participants. Students 
are also classified as to high school status as of 1982 (i.e., dropouts, same 
school, transfer, or early graduate). 

The First Follow-Up Senior File contains responses from 11,995 individuals 
and includes both base year and first follow-up data. This file includes 
information from respondents concerning their high school and postsecondary 
experiences and their work experiences. 

1.4 Scope of the Postsecondary Education Transcript Studies 

Although the HS&B follow-up surveys have collected longitudinal data on
postsecondary educational activities of sample members, the kinds and quantity of
information collected on course-taking patterns and on grades, credits, and
credentials earned have necessarily been limited by the survey methodology, and by
respondents' ability to recall the details of their educational experiences.

To overcome these weaknesses and to provide a rich resource for the future
analysis of occupational and career outcomes, the Postsecondary Education
Transcript Study (senior and sophomore) was designed to obtain official records
from academic and vocational schools. Transcript information was abstracted and
coded into machine-readable form, and can thus be merged with questionnaire data
and other records data (e.g., information from student financial aid records) to
support powerful quantitative analyses of the impacts of postsecondary schooling.

Data files created for the transcript study include detailed information 
about program enrollments, periods of study, fields of study pursued, specific 
courses taken, and credentials earned. In addition to providing a data resource 
for the analysis of educational activities and their impacts, the transcript data 
may be used as an objective standard against which student self-reports may be 
compared and evaluated, thus guiding the design of future studies. 

Transcript requests for the Sophomore Cohort Postsecondary Transcript Study
were made for a sample of the sophomore cohort students who reported in the
follow-up survey that they had attended a postsecondary institution (see Chapter
5, Sample Design and Implementation). Requests were made for 7,429 transcripts
to 2,139 schools. For some of the 6,098 sampled students, multiple requests were
made.

2. DATA COLLECTION 

Planning for the Sophomore Cohort Postsecondary Education Transcript Study 
began in the winter of 1987. Preparations for data collection included three 
major steps: 

1. Extracting information concerning each unique instance of postsecondary 
school attendance by younger cohort members from HS&B follow-up survey 
data files, and sorting this information by institution name and 
identification number. This data file was used to generate the printed 
lists of students sent to registrars and other school administrators to 
request transcripts. 

2. Constructing up-to-date address files for all postsecondary 
institutions reported by sample members, and developing letters, forms, 
and other materials to be sent to school administrators explaining the 
purposes of the study, the legal authority under which the study was 
being conducted, and procedures for protecting the confidentiality of 
research subjects. 

3. Obtaining the endorsement and support of a broad spectrum of 
professional organizations engaged in research about and representing 
the interests of postsecondary institutions. Appendix A contains a 
list of sixteen organizations endorsing the study and encouraging its 
members to cooperate in data collection activities. 
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2.1 Data Collection Objectives 

The principal objective of the study was to obtain from institutions of 
interest reported by a sample member the formal transcripts or other equivalent 
records of their educational activities (i.e., documents authenticating 
enrollment and attendance in postsecondary programs, indicating academic or other 
types of performance, and showing any formal credits and credentials earned). In 
addition, course catalogs and other related publications were requested from 
these schools to facilitate the accurate and consistent coding of information 
about programs or fields of study, course titles, earned credits, grades, degrees 
or other credentials, and academic terms or other measures of enrollment 
duration. 

A total of 7,429 transcripts were requested from 2,139 schools for 6,098 
individuals (see Chapter 5). Transfer credits coded from a second school's 
transcripts have been systematically flagged in the data files so that analysts 
seeking to cumulate credits earned may easily avoid double-counting.

A secondary objective of the transcript study was to validate reports by 
sample members of school enrollment in their responses to follow-up surveys.
Thus, transcripts were requested from each school reported in follow-up
questionnaires, even if there was evidence that the respondent might not have 
completed the term of study or the requirements for credit. As indicated by the 
results described below, in a small but significant percentage of cases, 
institutions reported that the respondent either never actually attended classes 
at the named school, or else dropped out of classes before completing enough work
to justify the creation of a formal record. 

2.2 Mailout of Transcript Requests to Institutions 

During the week of June 15, 1987, packets of transcript survey materials 
were mailed to the postsecondary schools. The mailing was timed to arrive at 
registrars' or other administrative offices at the time of the lowest level of 
activity for the administrative staff. The requests were received after the 
first activity associated with graduation and transfer and prior to expected 
heavy work schedules associated with fall enrollments. 

Altogether, 7,429 transcripts were initially requested from 2,139 
institutions for 6,098 HS&B sample members. Each transcript request package 
contained the following, of which examples are provided in Appendix A: 

■ a list of postsecondary school organizations endorsing the transcript 
study 

■ a letter to the Registrar or Director of Admissions from the NORC
Director of Education Longitudinal Studies 

■ a letter of endorsement from the American Association of 
Collegiate Registrars and Admissions Officers (AACRAO) 
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■ a letter from the Director of the Center for Education Statistics 
authorizing NORC to conduct the study on behalf of the Secretary of 
Education 

■ an excerpt from the Family Educational Rights and Privacy Act 
(FERPA) indicating the legal authorization under which the request 
for records was made 

■ a brief description of NCES's National Education Longitudinal Studies 
program 

■ general instructions for participation in the study 

■ a computer-generated list of students for whom transcripts were being
requested 

■ a label to affix to each transcript to link the correct transcript to
HS&B files²

■ a transmittal form with instructions²

■ an invoice form for transcript reimbursement²

■ pre-paid envelopes for transcript shipment.²

Telephone follow-up of non-responding schools began in July, when transcripts
had been received from about 45 percent of the schools. Over the course of the
data collection period, 1,082 follow-up calls were made to schools. Below is a
breakdown to illustrate the level of effort required to obtain transcripts from
a small number of schools.



Number of Schools     Number of Calls

       384                   1
       204                   2
       131                   3
       265                  4-6
        98                over 6



Frequent changes of personnel, referrals to alternative administrators or
sites, and problems with the typical pace of internal mail delivery systems in
some schools resulted in the need to remail a total of 551 transcript-request
packets. Of these, approximately 150 schools required a second remailing and
another handful required three remailings.

² Copies not included in the appendix.

2.3 Data Collection Results 

To a great degree, the success of the transcript study hinged upon the 
cooperation of registrars and other administrators to whom transcript requests 
were sent. Although 93 percent of the schools were asked to supply fewer than 10 
student transcripts, 70 was the largest number from a single school. Despite the 
fact that transcript requests were made with the express written consent of
participating subjects and photocopies were provided to schools on request, and 
despite the fact that study materials fully explained the legal basis for the 
requests for the information, school officials had the right to decline to 
cooperate. Most officials supported the objectives of the study, however, and 
were both prompt and complete in their responses. Even so, other logistical 
obstacles had to be overcome. A small number of schools, all in the vocational 
and proprietary sector, had permanently closed, eliminating access to older
records. Other schools had relocated, changed their names, or merged with other 
institutions, necessitating extensive tracing efforts in order to deliver 
requests to appropriate offices, and complicating the task of locating specific 
student records. In the following sections we describe the response rates at 
three levels--the institution, the individual transcript (instance of 
attendance), and the student (for whom more than one transcript may have been 
requested).

2.3.1 The School-Level Response Rate 

Transcript requests for HS&B students were sent to a great variety of 
postsecondary school types, including small and large private vocational and 
proprietary schools as well as traditional degree-granting institutions of 
higher education such as 2- and 4-year colleges and universities with the full 
range of graduate and professional programs. Identical materials and procedures 
were used in the collection of transcripts from all types of schools. However, 
as shown in Table 2-1, proportionally more non-vocational institutions (e.g.,
colleges and universities) participated in the study than did their vocational
counterparts (e.g., trade and technical schools). The participation rates shown
in the table are the simple percentages of schools in each sector that returned 
at least one transcript. No attempt was made in this table to adjust either for 
the number of transcripts requested or for the possibility that only one 
transcript was requested for a student who did not actually attend the school. 

In the proprietary sector, only about 63 percent of the schools returned any 
transcripts. The sector, however, constituted only about 16 percent of the list
of schools. 

Schools in the other sectors were much more likely to return one or more 
transcripts, as is demonstrated in Table 2-2. These other types of schools 
constituted approximately 84 percent of the list of schools attended, and account
for nearly 93.4 percent of the transcripts requested.

Table 2-1

Response Rates to the HS&B Postsecondary
Education Transcript Study By Institution Type

                                   Private      Public      Public
                                 technical   technical     2-year/     Private      Public       Total
                   Proprietary      2-year      2-year jr. college      4-year      4-year     schools

Percent                  62.7%       84.3%       78.9%       93.1%       91.6%       95.9%       87.0%

Number of schools
in sector (N)            (341)        (89)       (157)       (479)       (608)       (465)     (2,139)
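
The total participation rate in Table 2-1 is consistent with a school-count-weighted average of the sector rates. The following is a minimal Python sketch of that arithmetic, using the percentages and school counts from the table; the dictionary and function names are illustrative and are not part of the study files.

```python
# Sector-level figures from Table 2-1: (percent of schools returning at
# least one transcript, number of schools in the sector).
TABLE_2_1 = {
    "Proprietary": (62.7, 341),
    "Private technical 2-year": (84.3, 89),
    "Public technical 2-year": (78.9, 157),
    "Public 2-year/jr. college": (93.1, 479),
    "Private 4-year": (91.6, 608),
    "Public 4-year": (95.9, 465),
}


def overall_rate(table):
    """Weight each sector's participation rate by its number of schools."""
    total = sum(n for _, n in table.values())
    weighted = sum(pct * n for pct, n in table.values())
    return weighted / total


total_schools = sum(n for _, n in TABLE_2_1.values())
proprietary_share = 100.0 * TABLE_2_1["Proprietary"][1] / total_schools

print(f"schools on the list: {total_schools}")
print(f"overall participation: {overall_rate(TABLE_2_1):.1f} percent")
print(f"proprietary share of schools: {proprietary_share:.1f} percent")
```

Because the rates are simple percentages of schools, this weighting does not adjust for the number of transcripts requested from each school.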



The higher response rates for the public and private non-profit schools may 
be attributable in large part to the typically longer period during which they 
have been in existence, and to the relative permanence of student files they 
maintain. The most common reasons reported by school personnel for being unable 
to return transcripts were that the records had been lost or destroyed (about 8 
percent of transcripts requested from schools in the proprietary sector), or that
there was no record at the school of the named student's attendance (11.8 percent 
of the transcripts requested from these schools). An additional 1.9 percent of 
the proprietary transcripts requested were unavailable because of school 
closures. However, 15.9 percent of the proprietary schools did not respond 
despite assurances that they would do so. 

In most cases, schools that returned transcripts also returned other 
related documents (e.g., bulletins and course catalogs) to assist coding.

2.3.2 The Transcript-Level Response Rate 

Table 2-2 shows data collection results at the level of the individual 
transcript for the total sample, and separately for each of the six types of 
postsecondary institution. Transcript response rates are calculated as ratios of 
the number of transcripts received to the number of "in-scope" transcript
requests. Of the 7,429 transcripts initially requested, 396 were classified as 
"out-of-scope" as a result of information returned by school personnel indicating 
that the individuals for whom transcripts were requested never attended their 
schools (or did not complete enough work to generate a formal record). Given 
this response by school administrators, these cases (transcripts) have been 
treated as outside the population of events being studied rather than as "missing
observations." (Duplicate transcripts received from two branches of the same
school were also classified as out-of-scope. They accounted, however, for less
than one percent of all transcripts and had no effect on the outcome.) The
implications of this definition of "out-of-scope" transcript requests for
interpreting the transcript data are discussed below.



Of the 7,033 "in-scope" transcripts requested, a total of 6,536 (92.9
percent) were returned to NORC for processing. Response rates varied from 95.4
percent for transcripts sought from public community and junior colleges to a low
of 69.1 percent from the proprietary schools. Rates were uniformly high (95.4,
94.7, and 94.4 percent) in the three large strata (public community and junior
colleges, public 4-year colleges, and private 4-year schools). Returns were
substantially lower from the strata of technical and proprietary schools.
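
The transcript-level response rate defined above can be reproduced with a short calculation. This is a minimal Python sketch using only the counts quoted in this section; the function and variable names are illustrative.

```python
# Counts quoted in Section 2.3.2 of this report.
REQUESTED = 7429      # transcripts initially requested
OUT_OF_SCOPE = 396    # school reported no attendance or no formal record
RECEIVED = 6536       # transcripts returned to NORC


def response_rate(received, requested, out_of_scope):
    """Received transcripts as a percentage of in-scope requests."""
    in_scope = requested - out_of_scope
    return 100.0 * received / in_scope


in_scope = REQUESTED - OUT_OF_SCOPE
rate = response_rate(RECEIVED, REQUESTED, OUT_OF_SCOPE)

print(f"in-scope requests: {in_scope}")
print(f"response rate: {rate:.1f} percent")
```

Removing the out-of-scope requests from the denominator is what distinguishes this rate from a simple ratio of transcripts received to transcripts requested.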

Table 2-3 below, however, illustrates the exceptionally high rate of 
response at the transcript level among those schools that returned at least one 
transcript. The number of transcripts as a percent of those requested ranged 
from a low of approximately 95 percent for public technical 2-year schools to
over 99 percent for the private technical 2-year schools. 
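
The return rates among participating schools are the ratios of transcripts received to transcripts requested from schools that returned at least one transcript. A minimal Python sketch of that calculation, using the counts from Table 2-3 (the names used here are illustrative):

```python
# (transcripts requested, transcripts received) among schools that
# returned at least one transcript, from Table 2-3.
TABLE_2_3 = {
    "Proprietary": (302, 295),
    "Private tech./2-year": (140, 139),
    "Public tech./2-year": (262, 250),
    "Public 2-yr./Jr. coll.": (1355, 1336),
    "Private 4-year": (1561, 1527),
    "Public 4-year": (3056, 2989),
}


def return_rate(requested, received):
    """Transcripts received as a percentage of those requested."""
    return 100.0 * received / requested


rates = {name: round(return_rate(req, rec), 2)
         for name, (req, rec) in TABLE_2_3.items()}
total_requested = sum(req for req, _ in TABLE_2_3.values())
total_received = sum(rec for _, rec in TABLE_2_3.values())
total_rate = return_rate(total_requested, total_received)

for name, pct in rates.items():
    print(f"{name}: {pct:.2f} percent")
print(f"TOTAL: {total_rate:.1f} percent")
```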

As can be seen in Table 2-2, reasons for non-return of transcripts varied
among institution types. School refusal accounted for just under 1 percent of 
missing transcripts. Confirmed school closings affected only 12 transcripts. 
Overall, just under 2 percent of transcripts were not available because records 
had been lost or destroyed, but among proprietary schools 7.9 percent were in 
this category. The remaining category (No Response) includes transcripts from 
one school for which no current mailing address could be found (and which may 
have been closed), schools that could not be successfully contacted by telephone, 
and schools that expressed the intention to return transcripts but did not do so 
in time for processing. Also included in this category are unreturned
transcripts from schools that did return a portion of the transcripts requested. 
Reasons for partial returns varied from clerical oversight in schools that were 
asked to provide large numbers of transcripts, to cases in which schools would 
not release a record because the student had not paid all his outstanding fees, 
and the like. 

Table 2-2 above also shows that in 396 instances (just over 5 percent of the 
total of 7,429 requests), school officials reported explicitly either that the 
specified student had never attended the school or that the student had not 
stayed long enough to earn any grades or credits, and therefore had no formal 
records. The percentage of this type of outcome varies little across the three
major strata of non-vocational or technical schools, but increases to about 14
percent of the public technical 2-year schools, and accounts for about 11 percent
within the proprietary sector. For purposes of the transcript study, these cases
were considered out-of-scope: they are "non-events," or at the very least they
are outside of the population of events under study.
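
The "never attended" percentages are computed on all requests, that is, in-scope requests plus out-of-scope requests. A minimal Python sketch of that calculation for two sectors and the total, using counts from Table 2-2 (the names are illustrative):

```python
# (in-scope requests, "never attended" reports) from Table 2-2.
SECTORS = {
    "Proprietary": (427, 57),
    "Public technical 2-year": (282, 47),
    "All institutions": (7033, 396),
}


def never_attended_pct(in_scope, never_attended):
    """Out-of-scope reports as a percentage of all transcript requests."""
    return 100.0 * never_attended / (in_scope + never_attended)


rates = {name: round(never_attended_pct(*counts), 1)
         for name, counts in SECTORS.items()}

for name, pct in rates.items():
    print(f"{name}: {pct} percent")
```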

Since the initial list of instances of school attendance was created using 
survey responses to the HS&B second and third follow-up surveys, these results 
create inconsistencies between the questionnaire data files and the postsecondary
transcript study data file. The discrepancy between student-reported 
postsecondary attendance and the evidence in school records is substantial, and 
so the decision to consider these instances as out-of-scope was not taken
lightly. It is important to note that this status code was only assigned to 
cases in the survey monitoring system when school officials confirmed in writing
their conclusion that the named student did not attend their school.
Administrators had considerable information about each student named on a
transcript request form, including full names, alternative names such as maiden
names, social security numbers, dates of birth, and approximate dates of
enrollment. In addition, there was considerable evidence in the materials
returned to NORC that school personnel had conducted thorough searches for
records, and often had cross-checked their results with admissions offices and
financial aid offices. We therefore believe that there is little or no
classification error in this status code.

Table 2-2
Transcript Dispositions

                                                      Institution Type

                                   Private      Public      Public
                                 technical   technical  community/     Private      Public
                   Proprietary      2-year      2-year jr. college      4-year      4-year       Total

Received                 69.1%       92.1%       88.7%       95.4%       94.4%       94.7%       92.9%
                         (295)       (139)       (250)     (1,336)     (1,527)     (2,989)     (6,536)

School refused            1.6%        2.0%        2.8%        0.6%        1.0%        0.7%        1.0%
                           (7)         (3)         (8)         (9)        (16)        (24)        (67)

Lost or destroyed         7.9%        1.3%        5.0%        0.8%        1.2%        1.3%        1.8%
                          (34)         (2)        (14)        (11)        (20)        (46)       (127)

School closed             1.9%        0.7%        0.4%        0.0%        0.0%        0.1%        0.2%
                           (8)         (1)         (1)         (0)         (0)         (2)        (12)

No response              19.4%        4.0%        3.2%        3.1%        3.4%        3.0%        4.1%
                          (83)         (6)         (9)        (44)        (55)        (94)       (291)

In-scope                100.0%      100.0%      100.0%      100.0%      100.0%      100.0%      100.0%
                         (427)       (151)       (282)     (1,400)     (1,618)     (3,155)     (7,033)

Never attended           11.8%        7.4%       14.3%        4.2%        5.5%        3.8%        5.3%
                          (57)        (12)        (47)        (62)        (94)       (124)       (396)

One interpretation of this outcome is that HS&B respondents overreported
instances of postsecondary school attendance by over 5 percent of the events
(unweighted). If so, researchers analyzing postsecondary schooling using only 
the survey data tapes would overestimate significantly the extent of this
activity. Furthermore, the true discrepancy may be even bigger than that
estimated by these results. In approximately half of the 293 cases in the "No
Response" category of Table 2-2, neither transcripts nor any other information
about the students' status was returned. In the absence of specific information
to the contrary, these cases have been treated as probable instances of
attendance, and therefore within the scope of the population of interest. It
is reasonable to expect that if information had been obtained for these cases,
some portion would have been declared as errors in reported attendance.

Table 2-3

Return Rates for Participating Schools

Institution                 Transcripts   Transcripts
Type                          Requested      Received      Percent

Proprietary                         302           295       97.68%
Private tech./2-year                140           139       99.29%
Public tech./2-year                 262           250       95.42%
Public 2-yr./Jr. coll.            1,355         1,336       98.60%
Private 4-year                    1,561         1,527       97.82%
Public 4-year                     3,056         2,989       97.81%

TOTAL                             6,676         6,536       97.9%

The fact that the rate of "Never Attended" classifications is twice as high
among proprietary and public technical/2-year schools as in other sectors is
consistent with descriptions of the incidence of last-minute withdrawals and
dropout rates at these schools, adding face validity to this view.

However, we do not believe that the evidence is strong enough to rule out 
alternative interpretations. One reasonable possibility is that many of these 
instances of reported attendance result from errors in the coding of incomplete 
or marginally legible school names written by respondents into survey 
questionnaires. Conceivably, then, respondents may have in fact attended some 
form of postsecondary school, but the data in the questionnaire files may be 
wholly or partially inaccurate for these individuals. If this were true for each 
discrepant case, then the questionnaire files would accurately reflect the extent 
of postsecondary educational activity, but would include measurement errors 
concerning the specific school attended. 

A third alternative seems to us equally persuasive. Although there were 396
transcript classifications of non-attendance, only 229 individual sample members
were classified as out-of-scope as a result. Of these 396 transcripts, 58.3
percent (231) were requested for the 229 out-of-scope students.³ However, the
229 students represent only 3.8 percent of the total sample of individuals
(6,098). Only one transcript was requested for 227 of these individuals and two
transcripts each for two individuals. For these 229, school officials returned a
report of no or insufficient attendance.

Although a detailed analysis has not been possible, it is conceivable that 
many of these individuals may have attempted to report the same institution in 
both HS&B follow-up surveys, but in one instance returned a low-quality response 
resulting in a coding error for one of the two reports. If a transcript was 
returned for the correctly coded school, a thorough analysis of its contents 
could shed considerable light on the nature of the apparently erroneous report. 
The contract for data collection and processing did not include support for this 
type of analysis. However, the public use data files contain data records for 
all 6,098 sample members for whom transcripts were requested (including the 229 
classified as out-of-scope), and include separate records for each transcript
requested (including the school identifiers for the 396 transcripts classified as
out-of-scope), thus providing researchers with all the material needed to fully
assess the issues of measurement error. The variable FINDISP stored on each of 
the transcript-level records identifies out-of-scope transcripts (and sample 
members) for further analyses. 

Researchers should note, however, that the adjusted weights attached to the
transcript file apply to individuals, not transcripts. Thus, adjusted weights
are attached only to the 5,869 "in-scope" sample members.
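
The distinction between transcript-level dispositions and student-level weights can be sketched in Python as follows. Only the variable name FINDISP comes from this report. The disposition code value, the record layout, and the weight values below are hypothetical placeholders, and the actual codes should be taken from the data file documentation.

```python
# Hypothetical FINDISP value marking an out-of-scope transcript; the real
# code must be looked up in the data file documentation.
OUT_OF_SCOPE = 9

# Illustrative transcript-level records (student ID plus FINDISP).
transcripts = [
    {"student_id": 1, "FINDISP": 1},
    {"student_id": 1, "FINDISP": OUT_OF_SCOPE},
    {"student_id": 2, "FINDISP": OUT_OF_SCOPE},
    {"student_id": 3, "FINDISP": 1},
]

# Illustrative adjusted weights, one per in-scope individual. Student 2
# carries no weight here because every transcript requested for that
# student was out-of-scope.
student_weights = {1: 150.0, 3: 210.0}


def in_scope_students(records):
    """IDs of students with at least one in-scope transcript request."""
    return {r["student_id"] for r in records if r["FINDISP"] != OUT_OF_SCOPE}


students = in_scope_students(transcripts)

# Sum weights once per individual, never once per transcript.
weighted_total = sum(student_weights[s] for s in students)
print(sorted(students), weighted_total)
```

Summing over the set of student IDs, rather than over transcript records, keeps a student with several transcripts from being counted more than once.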

2.3.3 Student-Level Data Collection Results 

Transcripts were sought for 6,098 selected (see Chapter 5) HS&B 1980 
sophomore members who reported attending postsecondary schools since leaving high 
school. Reports of postsecondary attendance were obtained from HS&B second and 
third follow-up survey questionnaire responses. To be eligible for the 
transcript study, respondents must have provided specific information (i.e., the 
name and, desirably, the city and state) about at least one of the postsecondary 
schools attended. As described above, reports from school personnel indicated
that 229 individuals who reported attending only one or two postsecondary schools
had not actually attended those schools (or had not completed enough work to have
established a formal record).

Table 2-4 presents distributions of the number of transcripts received for 
each student. Excluding the 229 out-of-scope cases, one or more transcripts were 
obtained for 94.3 percent of the 5,869 enrollees. A single transcript was 
received for 4,620 cases (78.7 percent of this group). Two transcripts were
processed for 829 individuals (14.1 percent) and three or more transcripts were 
obtained for 84 sample members (1.5 percent). 
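
The student-level percentages above are taken against the 5,869 in-scope sample members. A minimal Python sketch of that arithmetic, using the counts reported in this section and in Table 2-4 (the names are illustrative):

```python
# In-scope sample members, and the number of students by count of
# transcripts received (from Table 2-4).
IN_SCOPE_STUDENTS = 5869
RECEIVED_COUNTS = {1: 4620, 2: 829, 3: 78, 4: 6}


def pct_of_in_scope(count):
    """A count of students as a percentage of in-scope sample members."""
    return 100.0 * count / IN_SCOPE_STUDENTS


at_least_one = sum(RECEIVED_COUNTS.values())
three_or_more = RECEIVED_COUNTS[3] + RECEIVED_COUNTS[4]

print(f"one or more transcripts: {at_least_one} "
      f"({pct_of_in_scope(at_least_one):.1f} percent)")
print(f"exactly one: {pct_of_in_scope(RECEIVED_COUNTS[1]):.1f} percent")
print(f"exactly two: {pct_of_in_scope(RECEIVED_COUNTS[2]):.1f} percent")
print(f"three or more: {three_or_more} students")
```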



³ Multiple transcripts were requested for many individuals for whom some
or all transcripts may have been out-of-scope. Thus, an individual could
have both in-scope and out-of-scope transcripts requested for them.
See Section 2.3 for further detail.

Table 2-4

Number of Transcripts Received: HS&B Postsecondary
Education Transcript Study

Number of              Number of     Percent of              Percent of
transcripts            respondents   in-scope respondents    total respondents

None (in-scope)            335              5.7                     5.5
One                      4,620             78.7                    75.8
Two                        829             14.1                    13.6
Three                       78              1.4                     1.3
Four                         6               .1                      .1

Total in-scope           5,869            100.0                    96.3

None (out-of-scope)        229               NA                     3.8

TOTAL SAMPLE             6,098             96.2                   100.1


In addition to collecting multiple transcripts per case, many transcripts 
contained information about credits transferred from other schools. Transfer 
credits were specially flagged in the data files to assist researchers in 
avoiding double-counting of earned academic credits by those who attended more
than one school. Transfer credits for 5,533 individuals have been documented in
their transcript records. The variables TRNSFERS on the student-level record and
TRNSFERT on the transcript-level record in the data files identify individuals
and transcripts containing transfer credits.
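
The double-counting problem can be illustrated with a small Python sketch. The record layout, the `is_transfer` field, and the credit values are hypothetical stand-ins for the transfer flags in the data files, and the sketch assumes that the transcript from the originating school was also collected.

```python
from dataclasses import dataclass


@dataclass
class TermRecord:
    school_id: str
    is_transfer: bool   # hypothetical stand-in for the transfer flag
    credits: float


def cumulative_credits(terms):
    """Sum credits, skipping terms flagged as transferred from elsewhere."""
    return sum(t.credits for t in terms if not t.is_transfer)


# Illustrative records: school B's transcript re-lists the 30 credits the
# student earned at school A as a transfer term.
terms = [
    TermRecord("school_A", False, 30.0),
    TermRecord("school_B", True, 30.0),
    TermRecord("school_B", False, 45.0),
]

naive_total = sum(t.credits for t in terms)
deduplicated_total = cumulative_credits(terms)
print(naive_total, deduplicated_total)
```

If the originating school's transcript was not obtained, excluding the flagged transfer credits would understate the total, so the choice depends on which transcripts are present for the individual.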

3. DATA PREPARATION 

3.1 Data Preparation Objectives 

The diversity in structure and contents that exists among the transcript
records reflects the great variability among the schools from which they were 
obtained. Although transcripts from public and private 2-year and 4-year 
colleges were generally similar with respect to the data they contained, for 
example, they nevertheless differed in their physical layout and in the 
terminology used for identical or related concepts. Early in the design stage 
for the Senior Cohort Postsecondary Transcript Study, it became apparent that the 
superficial similarities in many transcripts give way to countless differences in 
the ways in which academic progress is measured and recorded. This is especially 
true of course grades and credits. 

The variability across institutions in the details of transcript information 
defies any simple reconstruction or homogenization. Virtually any element on an 
academic transcript, including such seemingly straightforward items as course 
titles, may be subject to highly particularized local conventions whose logic may 
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be independent of, or even contravene, common practices. For example, it is not 
uncommon to find courses in English composition merged with other content and 
carrying formal names suggesting that they belong in the social science 
curriculum. Such instances, by no means rare, were resolved by Computer-Assisted
Data Entry (CADE) staff, who consulted program-of-study catalogs and descriptions
of courses obtained from the postsecondary institutions. 

Even more problematic was the issue of standardizing metrics for such 
typical transcript elements as grades or credits. For example, the notion that 
one school's grading or credit system can be equated to another's by a simple 
linear transformation of scores may have been defensible for secondary school 
grades in the sophomore cohort high school transcript study. Attempting the same
sort of "equating" with postsecondary school grades and credits carries the risk 
of introducing systematic errors into complex analyses. 

In preparing the data for conversion to standardized, machine-readable data
files, NORC's approach was to impose a common structure and organization on the
transcript information, but to preserve to the extent feasible the actual 
information contained in the original documents. Thus, grades and credit values 
are stored as they were reported, and have not been transformed to any common 
metric. Such fields as degrees and credentials earned, major and minor fields of 
study, and titles of courses taken have been assigned numeric codes as explained 
below, but also have been recorded exactly as they were reported on the 
transcripts. 

This approach places some additional burden upon transcript data users to 
gain familiarity with the variability across institutions and sectors in the data 
values stored in such fields as grade point averages, course grades, and credits. 
Our exposure to these data during their collection and processing leads us to 
conclude that in order to use these complex files effectively in educational 
research, each analyst should make a detailed assessment of the properties of all 
transcript elements of interest. 

As is described in detail below, data preparation was carried out by a staff
of 10 specially trained coders under the guidance of a supervisor and the data 
preparation manager. The data preparation task included analyzing the transcript 
document to determine its general organization and special characteristics, 
abstracting standard information from the highly varied documents into a common 
format, and assigning standard numerical codes to such transcript data elements 
as major and minor fields of study, degrees earned, types of academic term,
titles of courses taken, grades, and credits. 

3.2 Data Organization 

Transcript data were organized into a four-level hierarchy consisting of
data at the student, transcript, term and course levels. (See Exhibit 3-1.)
At least one student-level and one transcript-level record is provided for each
sample member for whom a transcript was requested, even if the school reported
that the individual had never attended, or had withdrawn before establishing a
formal record. Records in this category are flagged with a special disposition
code. (See Chapter 2 above for a discussion of out-of-scope cases.)

Student-level data refer to general information about the respondent's
educational career. All records are assigned case ID codes, allowing merger of
transcript data with other files (term and course), relevant questionnaire data 
from the HS&B base year and follow-up surveys (e.g., self-reported high school
program), high school grades, composite and derived variables from survey data
(base year SES quantities, achievement test quantities, etc.), data on the
respondent's high school, sampling weights, and data summarizing information 
found on transcripts for all postsecondary schools attended (e.g., an educational 
activity status measure for several points in time between 1981 and 1987). 
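
Merging on the case ID can be sketched in Python as a simple keyed join. The field names (`case_id`, `hs_program`, `school_id`) and the values are hypothetical illustrations, not the variable names used in the data files.

```python
# Illustrative student-level records keyed by case ID.
students = {
    101: {"case_id": 101, "hs_program": "academic"},
    102: {"case_id": 102, "hs_program": "vocational"},
}

# Illustrative transcript-level records carrying the same case ID.
transcripts = [
    {"case_id": 101, "school_id": "A"},
    {"case_id": 101, "school_id": "B"},
    {"case_id": 102, "school_id": "C"},
]


def merge_on_case_id(transcript_records, student_records):
    """Attach student-level fields to each transcript-level record."""
    merged = []
    for record in transcript_records:
        student = student_records.get(record["case_id"])
        if student is None:
            continue  # no matching student-level record
        merged.append({**student, **record})
    return merged


merged = merge_on_case_id(transcripts, students)
for row in merged:
    print(row)
```

The join is one-to-many: a student with two transcripts yields two merged rows, each repeating the student-level fields.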

Transcript-level records contain data pertaining to a student's academic
record at a single institution, including the institutional ID code (FICE code
or vendor number), degree(s) or other credentials conferred with accompanying
dates, major and minor field(s) of study, and the student's cumulative grade
point average (GPA).

The term-level of the hierarchy contains information describing specific 
units of instruction. Term records usually refer to commonly understood academic 
terms such as quarters, trimesters, or semesters. Term-level records include the
type of term, season, start and end dates, the type and characteristics of
the grade scale employed during the term (e.g., letter or numeric scoring), the
number of courses associated with a term (and hence the number of course-level
records attached), and a special flag indicating regular or transfer status
for the term. The term type flag includes a code denoting credit for major 
standardized tests (e.g., CLEP, LSAT) as well as work and other life experiences 
for which credit is given. 

Course-level records store the data for each course taken by a student
during a specific term. The formal title of the course was entered verbatim
from the transcript, then assigned one of 78 academic or vocational program codes
based on those contained in the publication, A Classification of Instructional
Programs (Malitz, G.S., et al.; Washington, D.C.: National Center for Education
Statistics, U.S. Department of Education, 1981, hereinafter referred to as
"CIP"). The 78 instructional program codes employed in this study included 41
major program areas (2-digit), 20 program sub-groupings (4-digit) and 17 
individual programs (6-digit). An additional code was reserved to indicate 
lump-sum transfer course credit. A list of the 78 program classifications 
and their related CIP codes is included as Appendix B. Also entered were 
credits attempted and the grade received by the student for each course. 
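
The four-level hierarchy described in this section can be pictured as nested records. The following Python sketch is an illustration only: the class and field names are assumptions chosen for readability, not the layout or variable names of the data files.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Course:
    title: str                # verbatim course title
    program_code: str         # instructional program code
    credits_attempted: float
    grade: str                # stored as reported, not rescaled


@dataclass
class Term:
    term_type: str            # e.g. quarter, trimester, semester
    is_transfer: bool
    courses: List[Course] = field(default_factory=list)


@dataclass
class Transcript:
    school_id: str
    disposition: str          # e.g. received or out-of-scope
    terms: List[Term] = field(default_factory=list)


@dataclass
class Student:
    case_id: int
    transcripts: List[Transcript] = field(default_factory=list)

    def course_count(self) -> int:
        """Count course-level records across all transcripts and terms."""
        return sum(len(term.courses)
                   for transcript in self.transcripts
                   for term in transcript.terms)


# An out-of-scope transcript still has a transcript-level record,
# but no term-level or course-level records beneath it.
student = Student(101, [
    Transcript("A", "received", [
        Term("semester", False, [
            Course("English Composition", "23", 3.0, "B"),
            Course("College Algebra", "27", 3.0, "A"),
        ]),
    ]),
    Transcript("B", "out-of-scope"),
])
print(student.course_count())
```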

3.3 Computer-Assisted Data Entry (CADE)

3.3.1 CADE Concept 

In a conventional survey, the major data preparation tasks, 
editing/coding and data conversion, are performed in sequence by different 
individuals. The editor-coder follows a set of defined procedures to 
select, classify, and systematize data. The edited and coded documents are 
then given to data conversion operators for efficient, accurate conversion 
of the data to machine-readable form. Usually, the training and skills most
appropriate for a coder differ considerably from those of a data conversion
operator.

EXHIBIT 3-1
HS&B Transcripts Study: Data Organization

I. Student-level record

    - Student ID
    * Numbers of transcripts requested
    - Numbers of transcripts received
    * Transcript data indicator
    * Transfer courses flag
    * Survey data and composite variables from student data files:
          Socio-demographic variables
          Characteristics of secondary school attended
          Base year and follow-up study test scores
    * Postsecondary school enrollment status indicators
    - Sampling weights

II. Transcript-level record

    * Student ID
    * School ID (FICE or vendor number)
    * IPEDS number
      Final disposition of transcript requests
    - Postsecondary school census region
    - Postsecondary institution type
    * Sequence number
      Number of terms per transcript
    * Degree awarded:
          Type of degree
          Verbatim degree text
          Date degree conferred (month and year)
    * Cumulative grade point average
    * Field(s) of study:
          Verbatim text-major
          Major instructional program code
          Verbatim text-minor
          Minor instructional program code

III. Term-level record

    * Student ID
    * School ID (FICE or vendor number)
    * Transcript number
      Term number within transcripts by SORTDATE
    * Date of term (month or season and year)
    * Institutional context of term (transfer or non-transfer term flag)
    * Type of term:
          Types of academic terms
              Quarter, trimester, semester, variable length
          Types other than academic terms
              Test terms, other than test terms
    * Grade scale type in effect during term:
          Letter grade scale
          Numeric grade scale
          Highest grade possible
          Lowest grade possible
          Minimum passing grade

IV. Course-level record

      Student ID
    * School ID (FICE or vendor number)
    * Transcript number
    * Term number
    * Grade received for course
          Letter grade for course
          Numeric (0-100) grade for course
          Numeric (0-4) grade for course
    * Credits attempted for course
    * Verbatim text of course title
    * Course program code

* Denotes data recorded from transcripts using CADE. 

-^^ Denotes data derived from transcripts but not entered directly. 

- Denotes data merged from other data sources. 
The HS&B Postsecondary Transcripts Study required abstracting, coding,
and organizing data from over 6,500 forms that varied greatly in appearance
and content. Compared to the typical survey questionnaire, the amount of
data to be keyed per transcript was very small. The majority of the coding
task involved the assignment of Course and Major/Minor program codes,
selected from a rather complex taxonomy. Previous experience on complex
data abstraction studies involving small amounts of keyed data had shown
that reasonable efficiency gains could be expected by combining coding and
data conversion. Guided by this experience, NORC successfully modified its
proprietary Computer-Assisted Data Entry (CADE) system to accommodate the
data processing of postsecondary transcripts.

For the purposes of the HS&B Postsecondary Transcripts Study, a single 
member of the coding staff reviewed a transcript for all relevant, in-scope
data, classified those data, and entered the data into a computer file. 
Combining these steps ensured that transcripts would be handled as 
internally consistent, integrated records of an individual's educational 
activity. Moreover, since all transcript processing occurred at a single 
station, the use of CADE reduced the number of steps at which records might 
be lost or misrouted, or other errors introduced into the database. 

3.3.2 CADE Equipment: Hardware and Software 

The CADE program used in this study was prepared at NORC using the
fourth-generation database language Metafile on the IBM-compatible Corona
microcomputer. Each of 10 CADE operators was assigned to a microcomputer
station for transcript processing. The CADE program prompted the operator,
through a series of defaults, for entry of all of the data elements
requiring entry (i.e., all data elements marked * in Exhibit 3-1). The
program repeated this cycle through the transcript, term, and course levels
until all data for a transcript had been entered. Operator access to any
level of the data hierarchy for revision, editing, and the like was made
possible through selection menus.

Exhibits 3-2 through 3-10 illustrate entry screens that prompted the
operators for entry of data at the transcript-level, term-level, and
course-level.

The CADE program enforced a set of predetermined range and value 
limitations on each field, making it impossible for CADE operators to enter, 
for example, an illegitimate school ID (FICE code/vendor number), student 
ID, or combination of the two. The program allowed entry of only the 79
predetermined CIP codes at the transcript-level (major and minor) and
course-level. Similarly, grades and credit values entered had to fall
within specified ranges.
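
The range and value limits described above can be illustrated with a minimal Python sketch. The limits are copied from the help text shown on the entry screens in Exhibits 3-6, 3-8, and 3-10; the names FIELD_RULES and is_valid are hypothetical, and the actual CADE program was written in Metafile, so this is an illustration of the idea rather than the program's logic.

```python
# Hypothetical field rules modeled on the help text in Exhibits 3-6, 3-8, and 3-10.
# Each rule is (low, high, reserved_codes): a value is accepted if it lies in
# [low, high] or equals one of the reserved codes (e.g., "missing").
FIELD_RULES = {
    "degree_year": (81, 87, {99}),      # 81-87, or 99 = missing
    "degree_month": (1, 12, {99}),      # 1-12, or 99 = missing
    "cip": (1, 78, {95, 96}),           # 1-78, 95 = uncodable, 96 = none
    "credits": (0.0, 999.999, set()),   # 0-999.999
}


def is_valid(field, value):
    """Return True if value passes the range/value rule for field."""
    low, high, reserved = FIELD_RULES[field]
    return low <= value <= high or value in reserved
```

A data entry program built this way would simply refuse to advance past a field until is_valid returns True for the keyed value.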

The most difficult aspect of transcript coding is classifying the 
fields of study and formal course titles using the CIP taxonomy. The CADE 
operators were issued coding manuals that included CIP category dimensions, 
as well as course catalogs and other resource materials relevant to 
transcript coding. To supplement conventional uses of the CIP manual, the 
ROOTMENU

SCHL      School (N)
STUD      Students (N)
NOMAD     Nomad (P)
LISTER    Report Generator (P)
STATS     Xtab Reports (P)
UPLOAD    Cade Upload Proc (P)
EXIT      EXIT this project (N)

<ESC> to EXIT

Exhibit 3-2  The initial CADE screen depicting data processing options.
             For entry of a transcript record, the CADE operator selects
             the "SCHL" option (highlighted).



School NODE

INTRFACE  Transcripts CADE (P)
BROWSE    School R.C. (P)
SCID      Transcripts R.C. (I)
EXIT      Exit to Root level (N)

<ESC> to EXIT

Exhibit 3-3  CADE operator selects option "INTRFACE" to begin
             entry of a transcript record.



School ID: 001005 
Student ID: 10707058



Exhibit 3-4 Operator enters valid school ID (FICE) and student 
ID combination. 



School 001005 Student 10707058 Disp:0 Id 99 count 0 at 11:21AM

CADE main menu - select function

    ENTER
    EDIT
    VIEW
    VERIFY
    QUIT

Exhibit 3-5  Main CADE options menu. The CADE operator selects
             the option necessary for processing data. Notice that the screen
             includes the school, student, current disposition of record,
             CADE operator ID number, count (i.e., total number of
             terms and courses in the record), and current system time.
             In this case the CADE operator selects "ENTER" to enter
             a transcript record.
School 001005 Student 10707058 Id 99 count 0 at 11:21AM

Enter degree information

Kind   Text   GPA   Month   Year
2      BS     4.3   12      86

Text of Major        Cip   Text of Minor             Cip
RUSSIAN HISTORY      67    COMPARATIVE LITERATURE    4f

ESC = quit   F1 = help   F2 = show CIP numbers

Text of Minor

Kind  : 2 = Bachelor  3 = Master  4 = PhD  5 = License  6 = Cert.  7 = None
GPA   : 0.0-4.0, or 9.9 = missing
Year  : 81-87, or 99 = missing
Month : 1-12, or 99 = missing

Exhibit 3-6  CADE prompts the operator to enter degree-related information.
             Striking the F1 function key produces a listing of all valid
             codes for each variable at the transcript level.
School 001005 Student 10707058 Id 99 count 0 at 11:23AM

CADE entry menu - select function

    Add Term        (to end)
    Add Course      (to end)
    Insert Course
    Quit

Exhibit 3-7  CADE operator selects "Add Term" for entry of first term
             appearing on transcript.



School 001005 Student 10707058 Id 99 count 0 at 11:24AM

Enter term information

Term    Transfer   Grade Scale   Term Type   Season   Start Year
01000   1          1             2           1        81

ESC = quit   F1 = help

Transfer : 1 = transfer term  2 = regular term
Scale    : 1 = letter  2 = numeric 0-100  3 = numeric 0-4  8 = missing
Type     : 1 = variable/non-course  2 = semester  3 = trimester  4 = quarter
           5 = test  9 = missing
Season   : 1 = fall  2 = winter  3 = spring  4 = summer  5 = no season
           9 = unknown
Year     : 81-87 or 99 = missing

Exhibit 3-8  Term data are entered into transcript record. The CADE
             operator summons a "help" list of all valid term-level
             codes and labels by pressing the F1 key (bold).
School 001005 Student 10707058 Id 99 count 0 at 11:25AM

CADE entry menu - select function

    Add Term        (to end)
    Add Course      (to end)
    Insert Course
    Quit

Exhibit 3-9  The CADE operator now selects "Add Course" to enter the
             first course for Term 1.



School 001005 Student 10707058 Id 99 count 0 at 11:25AM

Enter course information

Course   Grade   Credits   Course Title          Cip
01001    A       3.0       CONTEMPORARY POETRY   41

ESC = quit edit   F1 = help   F2 = show CIP numbers

Term: 01000  Transfer: 1  Scale: 1  Type: [illegible]  Season: 1  Start: /81  End: /

Grade   : [illegible] S, U, P, W, WP, WF, I, IP, IF, CR, [illegible]
Credits : 0-999.999 (999.998 = await supervisor edit/delete)
Cip     : 1-78, or 95 = uncodable, or 96 = none/not applicable

Exhibit 3-10  Data corresponding to the first course in Term 01000 (1)
              are entered by the operator. The screen includes a view of
              the term-level data already entered (bold).
CADE program included a computerized version of it. When this feature was
activated, coders were able to obtain a screen display of the CIP codes and 
their definitions. 



3.3.3 CADE Operator Training 

The CADE operator staff was given six days of intensive training,
which included formal classroom instruction and independent coding practice
and drill. Each day's training lasted a full eight hours because of the
novelty of the coding/data entry technique employed and the complexity of
the task. The benefit of the training investment was immediately apparent
in the high quality of the coding work (both initially and throughout the
period of activity) and the exceptionally low turnover rates for the coding
staff. It was also reflected in the completion of the coding task 16 days
ahead of schedule.

CADE operator training addressed the following topics:

■ Hierarchical organization of transcript data 

■ Analysis of transcript-document formats (special emphasis on
documents received from non-HEGIS institutions)

■ CIP codes, dimensions of instructional program categories

■ Operation of CADE using the IBM-compatible Corona PCs

■ Progressive, skills-improvement drills at the PC

■ Individual exercises with mock transcript coding 



CADE operator trainees reviewed sample transcripts from a wide variety
of school types: HEGIS and non-HEGIS, 2-year and 4-year, private and public.
Drills, designed to increase coder identification of in-scope data, were
conducted daily with excellent results. A major component of classroom
training addressed the logic of the instructional program category
dimensions and the CIP codes.

3.4 Data Quality Management

Quality control of transcript record data was introduced and maintained 
through a combination of procedures: error prevention features within the 
CADE program, verification re-entry of transcripts, supervisor analysis of 
course-file records, supervisor review of entire transcript records, and the
continual availability of coding supervisors for consultation and guidance. 

The CADE program itself screened for error in three ways. Through a
check-digit system, the program disallowed entry of incorrect identification
data (i.e., school FICE codes, student ID numbers, and combinations of
schools and students). Furthermore, each data field was programmed to
disallow entry of illogical or otherwise incorrect data. For example, a
coder was automatically prevented from entering a letter grade for a course
if a numerical grading system had been specified on the term-level records
under which the courses were listed.
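
The cross-level check in the example above (a course grade must agree in form with the grade scale recorded for its term) can be sketched as follows. The scale codes are those listed on the term entry screen in Exhibit 3-8; the function name and the choice to represent letter grades as strings and numeric grades as numbers are assumptions made for illustration only.

```python
LETTER_SCALE = 1         # term-level scale code: 1 = letter
NUMERIC_SCALES = {2, 3}  # 2 = numeric 0-100, 3 = numeric 0-4


def grade_matches_scale(scale_code, grade):
    """Return True if the form of the grade agrees with the term's grade scale."""
    is_numeric = isinstance(grade, (int, float))
    if scale_code == LETTER_SCALE:
        return not is_numeric
    if scale_code in NUMERIC_SCALES:
        return is_numeric
    # Scale coded as missing (8): no cross-check is possible, so accept.
    return True
```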

Ten percent of each CADE operator's output was subject to verification 
re-entry by a trusted, specially trained verifier. The verifier was chosen 
by the supervisor to re-enter selected cases and note patterns of 
discrepancy in coding. The verification procedure enabled management to 
better assess the degree of agreement among coders. Verifier re-entry of
transcripts involved 886 transcripts, or 13.5 percent of the transcripts 
processed. Of the 886 re-entered transcripts, the verifier found at least 
one disagreement in 565 cases, the majority of these occurring in the first 
three weeks. 

All terms and courses were assigned to 1 of 16 course-files, to await
eventual mainframe upload. A special report utility in CADE allowed
management to dump all terms and courses stored in a particular course-file
for critical examination. Where problems were observed, for example, in a
specific category of courses, a more detailed report could be produced that
showed only those courses corresponding to one or several CIP categories.
Course-file analysis led to several important updates to the CADE operators'
manual.

The CADE shop supervisor analyzed some 18,000 courses over a 14-week
period, 10,000 of which were coded during the first and second weeks of 
production. 

One supervisor critically reviewed 649 randomly selected transcripts (9
percent of transcripts processed). A supervisor submitted weekly reports to
management detailing error rates for each variable in each hierarchy. The
rate of error was calculated by dividing the sum of errant coding decisions
by the number of times a given variable was coded (i.e., "chances").
The rate of error calculated for the two variables deemed most critical,
major field of study CIP and course CIP, was 5.3 percent (major field of
study) and 3.8 percent (course CIP).
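
As a worked illustration, an error rate expressed as a percentage is the number of errant coding decisions divided by the number of chances, times 100. The sketch below is a hypothetical helper, not part of the CADE system. The first example uses made-up counts chosen to reproduce a 3.8 percent rate, since the report gives only the rates; the second applies the same arithmetic to the verifier figures reported above (565 of 886 re-entered transcripts with at least one disagreement).

```python
def error_rate(errors, chances):
    """Percentage of coding decisions ("chances") judged to be in error."""
    if chances <= 0:
        raise ValueError("chances must be positive")
    return 100.0 * errors / chances


# Illustrative counts only: 38 errors in 1,000 chances gives 3.8 percent.
course_cip_rate = round(error_rate(38, 1000), 1)

# Share of re-entered transcripts with at least one disagreement: 63.8 percent.
disagreement_share = round(error_rate(565, 886), 1)
```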

As part of quality control, supervisors also reviewed screens of 
transcript records. These screens included the user-file ID of the CADE
operator who entered the record, allowing the supervisor to make individual 
assessments and thus provide personal feedback to staff. 

As unanticipated problems arose during the CADE period, a policy 
decisions protocol was followed. All questions and other issues were 
directed to project management staff for assessment and final coding 
decision. The resulting decisions were routinely distributed to the CADE 
operators, to be added to their coding manuals. 

4. DATA PROCESSING 

Data processing activities began with the construction of the subsample of
postsecondary attenders from the main survey files, and the creation of lists of
institutions from which transcripts were to be requested. They continued with
the development of programs and materials to request transcripts and to monitor
data collection activities, and with the adaptation of NORC's Computer-Assisted
Data Entry (CADE) system for the abstraction and coding of transcript
information. These activities have been described in Chapters 2 and 3 of this
manual. Once the transcript data were converted to machine-readable form, they
were restructured into a set of four rectangular data files for efficient storage.
The files were then uploaded from microcomputers to mainframe facilities. Further
processing included computer editing of the data and the creation of sets of
program control files to permit the construction of analysis files using either
SAS or SPSS, the two statistical packages most commonly used for analyzing NELS
data sets. Finally, two sets of adjusted sampling weights were created for
making population estimates with transcript and other survey data. This chapter
describes the activities from machine editing through data file construction.
Sampling and weighting are the subject of Chapter 5.

4.1 Machine Editing 

As described in Chapter 3 above, the CADE program was designed with 
extensive controls on data entry, resulting in very low error rates for all 
elements in the raw data. The computer editing strategy was guided by the same 
principles as the CADE design process; that is, a highly flexible approach was
necessary to accommodate the tremendous variation in format and quality of 
transcripts. 

To begin with, a thorough analysis was made of the distribution of values 
for each separate item in the raw data files. The purpose of this check was to 
identify data values that, based on knowledge of and experience with transcript
data, appeared to be errors. Because of the extensive "front-end" cleaning
performed by the CADE program, the bulk of the raw data items appeared to have
very few errors, with the average error rate less than one half of one percent.
In most instances, stray codes and illegal values were the results of specific
keying errors that could not be prevented in a cost-effective manner by the CADE
program.

4.2 Organization and Content of the Data File 

The CADE program processed data at three levels described in Chapter 3: 
transcript, term, and course. The design of the final data files called for an 
additional data level, the student level, under which all transcript data for 
each sample member would be ordered. The student record was formed by 
aggregating all records for an individual student, merging data from the sampling 
and receipt control files, computing a series of composite variables based on 
data from all of a student's transcripts, and finally merging in a set of 
composite variables from the main HS&B third follow-up survey data tapes. 

In designing the final transcripts database, data storage efficiency was a 
major consideration. A standard rectangular file organization was ruled out 
because the amount of space required to handle the maximum record length for 
every case would have been impractical. Further, because the amount of data 
stored for each case was extremely variable (most cases for 3-4 year schools had 
an average of 1 transcript, about 8 terms, and about 31 courses, but some cases
had 3 times this amount of data), a flat file structure would have been populated 
with empty data fields for most cases. Vocational schools averaged 1 term. 
To optimize storage space for the vast amount of information in the
transcript study database, each of the four record types (student, transcript,
term, and course records) was written to a separate file. Analysts may use the
four files individually or jointly, depending upon their specific research
objectives. For many analyses, researchers may find it sufficient to use only
the composite variables from the questionnaire and transcript files stored on the
student-level records. For other purposes, merging the student-level and
transcript-level records will provide the amount of detail desired. However, for most
studies, it will be necessary to merge all four files into a single hierarchical
file in which courses are nested within terms, terms within transcripts, and
transcripts within students. Once this merged file is created, analysts may
construct any number of composite indicators of educational activity, and then
reduce the data matrix to keep only the variables essential to the analysis in a
rectangular file with one record per student. Managing the data in this way will
reduce storage costs for online data sets and will minimize the computing (CPU)
time necessary to obtain results.
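
The reduction step described above (building composite indicators and keeping one record per student) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the key names STUID, TRANSNUM, and TERMNUM follow Exhibit 4-1, while the CREDITS field name, the function name, and the sample records (including the second student ID) are hypothetical.

```python
from collections import defaultdict


def student_summary(course_records):
    """Collapse course-level records to one summary row per student."""
    totals = defaultdict(lambda: {"n_courses": 0, "credits": 0.0})
    for rec in course_records:
        row = totals[rec["STUID"]]
        row["n_courses"] += 1
        row["credits"] += rec["CREDITS"]
    return dict(totals)


# Made-up course records for two students.
courses = [
    {"STUID": 10707058, "TRANSNUM": 1, "TERMNUM": 1, "CREDITS": 3.0},
    {"STUID": 10707058, "TRANSNUM": 1, "TERMNUM": 2, "CREDITS": 4.0},
    {"STUID": 20000001, "TRANSNUM": 1, "TERMNUM": 1, "CREDITS": 3.0},
]
summary = student_summary(courses)
```

The resulting dictionary has one entry per student and can be written out as a rectangular file; the raw course records are no longer needed once the composites exist.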

The student and transcript record data files contain information for 6,098 
survey respondents in the transcript survey sample. Each member of the sample 
(including the "ineligible" cases described in Chapters 2 and 5) has a student-level
record in the file. A transcript record was created for each requested
transcript, even if the transcript was not returned, or if school officials 
reported that a student had never actually enrolled. Cases for whom transcripts 
were requested but not received have "dummy" transcript records in the file. 

On each transcript record is a disposition (status) code showing either that
the transcript was received and processed, and that term and course records exist
in the appropriate files, or showing the reason (if known) that the transcript
was not received (e.g., school had closed, records lost or unavailable). For
cases (transcripts) defined as out-of-scope (see Chapters 2 and 5), this
disposition field contains the code indicating that the school reported that the
student never attended the named school and that no transcript exists.
Researchers should note that any given sample member may have a combination of
transcript records classified as "received and processed", "out-of-scope", and
"not received, but in scope" associated with his or her student-level record.
(For conventional analyses of these data using the adjusted weights attached to
these files, it is strongly recommended that analysts first purge the files of
cases with no transcripts, including ineligibles and eligible nonrespondents.)

Associated with each "received" transcript are one or more Term Records
containing data for each of the terms reported. (No term or course records were 
created for cases for which no transcripts were received.) Separate course 
records were created for each unique course taken within a term, including failed 
and audited courses. 

4.2.1 The Student Record

As noted above, a student-level record was included in the database for
every sample member for whom a transcript was requested, including those who
later proved to be ineligible (never attended), or for whom no transcripts were
received. Student-level records contain identifying and survey control data,
activity state pointers, composite variables from the main HS&B files, and
weights.

4.2.2 The Transcript Record 

One transcript-level record was created for each transcript requested for
each sample member. There is at least one Transcript Record for every student;
over one-third of HS&B sample members have multiple Transcript Records. For
ineligible cases, or for eligible cases for whom a transcript was not received,
the transcript-level record is a placeholder or dummy record where information
about the transcript request (e.g., the institution's ID number, the final data
collection status code, etc.) is stored. If a transcript was received and
processed, the transcript record stores information related to the entire period
of attendance at the school, such as degree received, grade point average,
whether the school accepted any transfer credits, and so forth. Information
related to specific terms of attendance or specific courses taken is stored on
term- or course-level records, which may be linked by a combination of ID keys to
the transcript record with which they were originally associated. There are a
total of 7,429 Transcript Records in the HS&B sophomore cohort database.

A total of 443 transcript-level records (never attended plus duplicates)
exist for out-of-scope cases. These records should be omitted from conventional
analyses. Although raw weights have been included for these cases to permit the
calculation of additional customized weights, the adjusted weight for ineligible
cases is always set to zero (see Chapter 5).

4.2.3 The Term Record 

A Term Record was created to store data for each term associated with a 
transcript, and to provide an organizing mechanism for linking course-level 
records associated with a given term and transcript. Students have widely 
varying numbers of Term Records (up to 22 terms), reflecting the amount of time 
spent in postsecondary schooling. Students who enrolled only in one short-term 
vocational program, or who stayed for only one semester at an academic 
institution, may have only a single Term Record in the file. Approximately 10
percent of the 6,536 coded transcripts had a single associated Term Record. 
Students continuously enrolled in institutions of higher education since high 
school graduation have many more Term Records in the database. The HS&B 
sophomore cohort database includes a total of 43,592 Term Records (covering the 
5,533 students for whom one or more transcripts were received). Approximately 
half (51 percent) of the transcripts in the file are linked to four or fewer 
terms. An additional 32 percent of the transcripts are linked to between 5 and 8 
Term Records. Eight percent of the transcripts are linked to more than 10 Term
Records.

Most Term Records describe conventional academic terms of study such as
semesters or quarters. These Term Records store data that pertain to courses
taken during the specified term, and which otherwise would have been repetitively
and wastefully stored directly on Course Records. Term Records include such
items as beginning and ending dates for the term and the grade scale being used
for the courses taken in that term. In some cases, grading schemes at a school
changed during a student's period of attendance. Data on term records help to
identify these instances for proper handling.

4.2.4 The Course Record 

One Course Record was created for every course reported on a transcript. 
Credit-bearing entities other than courses were also stored in course records 
(e.g., credits earned through work experience or by examination). Varying
numbers of Course Records are associated with each term for a particular student. 
In all, there are 194,672 Course Records stored in the HS&B sophomore cohort 
database. 

4.3 Merging Records 

As described above, the postsecondary transcript database consists of four 
files, one for each record type. However, the individual record types have been
designed to allow for the merging of data from two or more files into a single
hierarchical file, or, if necessary, into a very large rectangular file. The
relationship among the various record types and the identifiers needed to merge
levels are summarized in Exhibit 4-1. 
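
A minimal Python sketch of such a merge is shown below. It nests the four record types using the identifiers named in Exhibit 4-1 (STUID, TRANSNUM, TERMNUM). The function name and the in-memory dictionary layout are assumptions made for illustration, and the sketch assumes that every term and course record has a matching parent record; a production merge would need to handle unmatched keys.

```python
def build_hierarchy(students, transcripts, terms, courses):
    """Nest courses within terms, terms within transcripts, transcripts within students."""
    tree = {s["STUID"]: {"student": s, "transcripts": {}} for s in students}
    for t in transcripts:
        tree[t["STUID"]]["transcripts"][t["TRANSNUM"]] = {"transcript": t, "terms": {}}
    for m in terms:
        node = tree[m["STUID"]]["transcripts"][m["TRANSNUM"]]
        node["terms"][m["TERMNUM"]] = {"term": m, "courses": []}
    for c in courses:
        term = tree[c["STUID"]]["transcripts"][c["TRANSNUM"]]["terms"][c["TERMNUM"]]
        term["courses"].append(c)
    return tree
```

A dummy transcript record (one with no term or course records) simply appears in the tree with an empty "terms" dictionary, which mirrors how unreceived transcripts are represented in the database.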

4.4 A Cautionary Note on the Use of Credits and Grades Data in the 
Postsecondary Transcripts Database 

As we have emphasized throughout this report, postsecondary transcript data
were abstracted from school records of greatly varying structure and content. It
is essential for researchers using these data to be fully aware that the
elements in the database are intended to be a faithful reproduction of the
information reported on the transcripts. Except for the creation of limited
composite variables, the transcript data have not been rescaled, standardized, or
otherwise manipulated prior to entry into the database. For some items, notably
course grades, school-reported grade point averages, and course credits, the
researcher must not assume that the data stored in the designated fields are all
values from a common underlying metric.

Course grades were entered as they appeared on the transcript. Two types of 
grades (letter and numeric) were stored on separate fields in the course records 
in order to minimize the effort needed to compute customized grade indicators. 

As explained above, a comprehensive list of allowable letter grades 
(including such administrative "grades" as "credit given," "audit," "withdrawal," 
"pass," and "fail") was constructed to handle the entry of letter grades reported 
by schools. Although nearly all (97.3 percent) of the schools assigned letter 
grades, not all schools used all possible grades to make distinctions between 
student performance levels. Although most schools used conventional "+" and "-" 
qualifiers, some schools applied these only to selected levels (e.g., C+, C, and
C-, but not B+ or B-). More important, however, is the fact that several
different schemes of numeric equivalents were used by schools in translating
letter grades to number grades for the computation of grade point averages. By
far the most common scheme is the standard four-point collegiate scale (A = 4,
B = 3, C = 2, D = 1, E or F = 0). A small number of schools assigned different
numeric equivalents, however, such as setting the value of an "A" grade to 5 or 6
Exhibit 4-1

A Schematic Diagram of the Database Hierarchy
Representing Nested Transcript, Term, and Course Records
for Three Sampled Students

Student Level               | STUID |
  Transcript Level          | STUID | TRANSNUM |
    Term Level              | STUID | TRANSNUM | TERMNUM |
      Course Level          | STUID | TRANSNUM | TERMNUM |
      Course Level          | STUID | TRANSNUM | TERMNUM |
    Term Level              | STUID | TRANSNUM | TERMNUM |
      Course Level          | STUID | TRANSNUM | TERMNUM |
  Transcript Level          | STUID | TRANSNUM |
    Term Level              | STUID | TRANSNUM | TERMNUM |
      Course Level          | STUID | TRANSNUM | TERMNUM |

Student Level               | STUID |
  Transcript Level          | STUID | TRANSNUM |
    Term Level              | STUID | TRANSNUM | TERMNUM |
      Course Level          | STUID | TRANSNUM | TERMNUM |

Student Level               | STUID |
  Transcript Level          | STUID | TRANSNUM |
    Term Level              | STUID | TRANSNUM | TERMNUM |
      Course Level          | STUID | TRANSNUM | TERMNUM |
    Term Level              | STUID | TRANSNUM | TERMNUM |
      Course Level          | STUID | TRANSNUM | TERMNUM |
  Transcript Level          | STUID | TRANSNUM |
    Term Level              | STUID | TRANSNUM | TERMNUM |
      Course Level          | STUID | TRANSNUM | TERMNUM |
numeric points. For this reason, some grade point averages in the fields on
the transcript-level records may exceed the conventional upper bound of 4.0.
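
To illustrate how the choice of numeric equivalents drives the resulting average, the following Python sketch computes a credit-weighted grade point average from letter grades. It is not the method used by any school or by the study; the function name is hypothetical, plus and minus qualifiers are simply stripped (an assumption), and administrative grades that are absent from the scale (for example W, P, or CR) are skipped.

```python
# Standard four-point collegiate scale as described in the text.
FOUR_POINT = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "E": 0.0, "F": 0.0}


def gpa(courses, scale=FOUR_POINT):
    """Credit-weighted GPA over (letter_grade, credits) pairs.

    Grades not found in the scale are skipped. Returns None when no
    course contributes to the average.
    """
    points = 0.0
    credits = 0.0
    for grade, cr in courses:
        base = grade.rstrip("+-")  # assumption: ignore +/- qualifiers
        if base in scale and cr > 0:
            points += scale[base] * cr
            credits += cr
    return points / credits if credits else None
```

Passing a different scale, such as one in which an "A" is worth 5 points, shows how an average above 4.0 can arise from the same set of letter grades.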

Less than 3 percent of all courses in the file were graded on a numeric
scale. These courses were disproportionately found on transcripts from
short-term vocational/proprietary school programs. To help establish a basis for
standardizing the metric for the numeric grades, the term records contain fields
showing the highest, lowest, and minimum passing scores for the designated
school's grading system if this information was present on the transcript or in
other documentation (bulletins or course catalogs) from the school.
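
One way a researcher might use the highest and lowest possible grades on the term record is a simple linear rescaling of each numeric grade onto a 0-1 range. The sketch below is only one possible approach, offered as an assumption for illustration; the study itself did not rescale grades, and the function name is hypothetical.

```python
def rescale_numeric_grade(grade, lowest, highest):
    """Map a numeric grade onto 0-1 using the term's lowest and highest possible grades."""
    if highest <= lowest:
        raise ValueError("highest must exceed lowest")
    return (grade - lowest) / (highest - lowest)
```

For example, a grade of 85 on a 0-100 scale and a grade of 3 on a 0-4 scale map to 0.85 and 0.75 respectively, which places them on a common footing for comparison. The minimum passing grade could be rescaled the same way to locate the pass threshold on the common scale.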

The data in the course credits field also were entered exactly as reported 
on the transcript form, with no attempt made to standardize the units. 
Researchers should use special caution in analyzing and further manipulating 
course credit data. At a minimum, researchers should familiarize themselves with 
the variability of data in the fields prior to conducting analyses. We further 
recommend that researchers carefully examine the ranges and distributions of 
credit values reported by different types of institutions. For the most part, 
standard collegiate institutions reported credits based on the same or very 
closely related credit scales. At these institutions, the typical academic 
course in most departments carried a value of 3 credits, and so this is the modal 
value observed for courses at these institutions. (In fact, 51 percent of all
courses taken by HS&B sample members carried exactly 3 credits.) A significant 
proportion of courses, especially those in the hard sciences that included 
extensions such as laboratory periods and other additions to standard classroom 
schedules, earned higher credit values, although the majority of the values fall 
between 3 and 5 credits for these expanded courses. Lower- level courses whose 
classes met for fewer hours per week had credit values between 1 and 3 (about 24 
percent of all courses taken). 

Courses with credit values greater than 5 were rare (about 2 to 3 percent of
all courses for which credits could be coded). Altogether, 95 percent of the
courses taken by sample members carried between 0 and 5 credits (about 5.8
percent of the courses carried no credit). Courses with credit values between 5
and 20 accounted for an additional 1.5 percent of all courses taken by HS&B 
cases. Credit values greater than 20 and up to the allowable limit of 999.997 
(almost exclusively from vocational programs reflecting clock-hour systems) 
accounted for less than 1 percent of those recorded. 

For most conventional analyses, researchers may wish to recode credit
values above, for example, 5 or 8 credits, to the missing data code in order to
prevent unusual programs with extreme values from affecting results.
Researchers who are especially interested in vocational programs and courses
should carefully examine all of the data related to courses with high credit
values.
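
The treatment of extreme credit values suggested above can be sketched as follows. The cutoff of 8 credits is one of the example thresholds mentioned in the text; the function name is hypothetical, and None is used as a stand-in for the missing data code, whose actual value in the files is not given in this section.

```python
MISSING = None  # hypothetical stand-in for the file's missing data code


def recode_high_credits(credits, cutoff=8.0):
    """Return the credit value, or MISSING when it exceeds the chosen cutoff."""
    return MISSING if credits > cutoff else credits
```

A clock-hour value such as 120.0 would be set to missing under this rule, while an ordinary 3-credit course would pass through unchanged.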

Of concern in any analysis of course credits is the possibility of 
differences between the numeric scales for credits awarded by schools on the 
semester system and schools on the quarter system. These two types of term 
systems accounted for about 79 percent of all terms in which sampled students 
were enrolled. Trimesters accounted for 16 percent of the 43,592 term records. 
There were fewer other types of terms in the transcripts. Variable-length terms,
common at vocational schools, accounted for an average of 1 percent of all terms
reported. Semesters, on the other hand, accounted for 65 percent of the terms; 
quarters accounted for 13.5 percent. 

Typically, the number of credits required for graduation from schools on the 
quarter system is slightly higher than the number required by schools on the 
semester system. This gives rise to the concern that course credits may not be 
expressed in comparable units across types of institutions, and that the value of 
a course given at a quarter system school may be "inflated" compared to
the credit value of the same course at a school on the semester
system. Some researchers have suggested that the transcripts data file include
additional fields containing rescaled or standardized credits, to ensure that
credits from differing systems were scored on a common metric. A frequent 
suggestion has been that course credits for schools on the quarter system be 
deflated by a linear transformation in order to more nearly equal those awarded 
by semester system schools. 

Although the Postsecondary Education Transcript Study did not include the 
resources for a formal study of this issue, a number of empirical analyses 
demonstrated that the fact of comparability or non-comparability cannot easily be
established. The simplest but most compelling evidence against any simple
transformation of quarter system course credits came from comparisons of the 
credit values of standard collegiate courses taken by students in both types of 
schools. These comparisons showed clearly that for most typical science, 
mathematics, social science, or humanities courses, the credit values were the 
same (generally 3 credits) at both types of institutions. Further comparisons of 
the average number of credits carried by students per term showed no systematic
or significant differences between the two systems. For these reasons, the final 
decision concerning course credits was to include on the public release tapes 
only the raw credit values as they were reported on the transcripts, and to 
caution researchers that the comparability of credits across institution and 
term types could not be assumed, but should be carefully assessed in light of
specific analytical objectives. 
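
As one possible starting point for such an assessment, an analyst might compare
the distribution of raw credit values by term type before deciding whether any
rescaling is warranted. The sketch below is illustrative only: the record
layout (term type paired with a credit value) and the sample values are
hypothetical and do not reflect the layout of the public release file.

```python
from collections import defaultdict
from statistics import median


def credit_summary(records):
    """Summarize credit values by term type.

    `records` is an iterable of (term_type, credits) pairs. For each term
    type the summary reports the number of courses, the median credit value,
    and the share of courses carrying exactly 3 credits.
    """
    by_type = defaultdict(list)
    for term_type, credits in records:
        by_type[term_type].append(credits)
    return {
        term_type: {
            "n": len(values),
            "median": median(values),
            "share_3_credits": sum(1 for v in values if v == 3.0) / len(values),
        }
        for term_type, values in by_type.items()
    }


# Hypothetical course records for illustration.
courses = [
    ("semester", 3.0), ("semester", 4.0), ("semester", 3.0),
    ("quarter", 3.0), ("quarter", 5.0), ("quarter", 3.0),
]
summary = credit_summary(courses)
```

If the summaries for the two systems are similar, as the comparisons described
above found for standard collegiate courses, a blanket deflation of quarter
system credits would distort rather than standardize the values.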

Finally, a major source of variation in the credit values in the file 
relates to the use of "clock hours" rather than conventional "credit hours" by 
vocational and proprietary schools. Students at these schools often earned 
several hundred clock hour credits for completing a unified program made up of 
several instructional modules each lasting a few days. Analysts are strongly 
urged to use special caution in the analysis of course credit fields because of 
the extreme effects these outlier values (some ranging as high as 999.997) may 
have on statistical estimates. These values have been retained in the system to 
support special analyses of relatively small subgroups of students and their
educational activities. Failure to provide for special handling of these cases 
may produce bizarre results in conventional analyses. 

5. SAMPLE DESIGN AND IMPLEMENTATION



The Sophomore Postsecondary Education Transcripts Study involved the 
collection and processing of school records for a subsample of the High School
and Beyond (HS&B) 1980 sophomore cohort. A full description of the sample
design for HS&B is provided in the sample design reports for the base year and
first, second, and third follow-up surveys. The following sections present an
overview of the sample design for the full survey. 

5.1 Base Year Sample Design 

The base year (1980) survey employed a two-stage, highly stratified sample 
design with secondary schools having tenth and/or twelfth grades as the 
first-stage units of selection and students within schools as the second-stage
units. With the exception of certain special strata, which were oversampled, 
schools were selected with probabilities proportional to their estimated 
enrollment in the tenth and twelfth grades. Within each school, 36 seniors and 
36 sophomores were randomly selected. In schools with fewer than 36 seniors or 
36 sophomores, all eligible students were selected. Sampling rates were set so 
as to select within each stratum the number of schools needed to satisfy study 
design criteria regarding minimum sample sizes for certain types of schools. As 
a result, some schools had a very high probability of inclusion in the sample (in 
some cases equal to 1.0) while others had a much lower probability of inclusion.
The total number of schools selected for the initial sample was 1,122, from a
frame of 24,725 schools with grades ten or twelve or both. Sampling strata and
the number of schools selected in each are shown in Table 5-1. 
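
The general idea of selection with probability proportional to enrollment,
including the case in which a large school is selected with certainty
(probability 1.0), can be sketched as follows. This is a simplified
illustration under stated assumptions, not the procedure actually used for
HS&B, which also involved stratification and oversampling of special strata.
The function name and the enrollment figures are hypothetical.

```python
def pps_inclusion_probabilities(sizes, n):
    """Inclusion probabilities proportional to size for a sample of n units.

    Units whose proportional share would reach or exceed 1 are taken with
    certainty, and the remaining sample slots are allocated proportionally
    among the other units.
    """
    probs = [0.0] * len(sizes)
    remaining = set(range(len(sizes)))
    slots = n
    while remaining and slots > 0:
        total = sum(sizes[i] for i in remaining)
        certain = {i for i in remaining if slots * sizes[i] >= total}
        if not certain:
            for i in remaining:
                probs[i] = slots * sizes[i] / total
            break
        for i in certain:
            probs[i] = 1.0
        remaining -= certain
        slots -= len(certain)
    return probs


# Hypothetical enrollments for five schools in one stratum; select two.
enrollments = [1000, 200, 150, 100, 50]
probabilities = pps_inclusion_probabilities(enrollments, n=2)
# The school-level base weight is the inverse of the inclusion probability.
base_weights = [1.0 / p for p in probabilities]
```

In this example the largest school is a certainty selection with a weight of
1, while the smallest school has the lowest probability of inclusion and
therefore the largest weight.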

Substitutions were made for schools that refused to participate in the 
survey. No substitutions were made, however, for students who for whatever
reason failed to participate. Substitutions for refusal schools occurred only
within strata. In certain cases no substitution was possible because a school
was the sole member of its stratum. (See the High School and Beyond Third
Follow-Up Sample Design Report, which is available from NCES.)

The realization of the sample by stratum is shown in Table 5-2. Although 
the sample design specified that students in all but the special strata would be 
selected with approximately equal probabilities, the probabilities are only 
roughly equal. In the special strata, students were selected with higher 
probabilities--in some instances, extremely high probabilities. Moreover, the
sample as realized did not equal the sample as drawn, creating further deviations
from a self-weighting sample. Consequently, each school (and student) was
assigned a weight equal to the number of schools (or students) in the universes 
they represented. Since each student's overall selection probability (hence 
weight) was further influenced by the sample design, the derivation of student 
case weights is discussed below. Calculation of school weights is described in 
the users' manual for the school questionnaire data file. 

Table 5-1

High School and Beyond Base Year School Sample Selection

Stratum                                                        Number

Special strata (oversampled)
  Alternative public                                               50
  Cuban public                                                     20*
  Cuban Catholic                                                   10*
  Other Hispanic public                                           106*
  High performance private                                         12
  Other non-Catholic private (stratified by
    four Census regions)                                           38
  Black Catholic                                                   30*

Regular strata (not oversampled)
  Regular Catholic (stratified by four Census regions)             48
  Regular public (stratified by nine Census divisions;
    racial composition; enrollment;
    central-city, suburban, rural)                                808

TOTAL                                                           1,122

*These schools were defined as those having 30 percent or more of enrollment
from the indicated subgroup.

Table 5-2

High School and Beyond Base Year Sample Realization

Stage 1: Sampling of schools

                              Drawn in    Original    Substituted    Total
Stratum                       sample      schools*    schools        realized

Regular public                   808         585          150           735
Alternative public                50          41            4            45
Cuban public                      20          11                         11
Other Hispanic public            106          72           30           102
Regular Catholic                  48          40            5            45
Black Catholic                    30          23            7            30
Cuban Catholic                    10           7            2             9
High performance private          12           9            2            11
Other non-Catholic private        38          23            4            27

TOTAL                          1,122         811          204         1,015

Stage 2: Sampling of students

             Total       Absent, both                          Partial
             drawn in    Survey and      Student    Parent     materials    Total
             sample      Make-Up Days    refused    refused    missing**    realized

Number        70,704         8,278        1,759       233        2,174       58,270
Percent          100            12            3                      3           82

*Includes additional selections made when schools were found to be out-of-scope.
**Unusable because critical survey materials missing.



Use of appropriate weights should lead to correct estimates (within sampling 
error) of the population of tenth and twelfth grade students in United States 
schools in spring 1980, and of subgroups within that population. 

5.2 1980 Sophomore Cohort Sample Design for Second and Third Follow-Up Surveys 

The sample design for the 1980 sophomore cohort was based on the high school 
transcript study conducted between the first and second follow-ups. During the 
fall of 1982, high school transcripts were sought for a probability subsample of 
nearly 18,500 members of the 1980 sophomore cohort. The subsampling plan for the
transcript study emphasized the retention of members of subgroups of special 
relevance for education policy analysis. Compared to the base year and first 
follow-up, the transcript study sample design further increased the
overrepresentation of racial and ethnic minorities (especially those with
above-average HS&B achievement test scores), students who attended private high
schools, school dropouts, transfers and early graduates, and students whose
parents participated in the base year parent survey on financing postsecondary 
education. 

Transcripts were collected and processed for nearly 16,000 members of the 
sophomore cohort. A public use data file containing transcript information is 
available from NCES. Transcript data can be merged easily with student 
questionnaire data files using the case identification numbers common to the two 
files. The Data File Users Manual for the HS&B High School Transcripts Study 
(also available from NCES) contains a full description of the sample design and 
other features of the transcript study. 

The sample for the second follow-up survey of the 1980 sophomore cohort was 
composed of approximately 15,000 cases selected from among the 18,500 retained 
for the transcript study. Like the second follow-up sample for the senior 
cohort, the sample for the sophomore cohort includes disproportionate numbers of 
persons from policy-relevant subpopulations--for example, racial and ethnic 
minorities, students from private high schools, high school dropouts, and 
students who planned to pursue some type of postsecondary schooling. The sample 
for the third follow-up survey was identical to that of the second follow-up. 
The second/third follow-up sample, though much smaller than the base year and 
first follow-up samples, is thus able to provide estimates for many
subpopulations that are nearly as precise, statistically, as those of the larger 
samples. The second and third follow-up sample allocation is shown below in 
Table 5-3. For further details see the High School and Beyond Second Follow-Up
Sample Design Report, by C. Jones and B. Spencer (NORC, 1984). The base year and 
first follow-up sample report is available from NCES. 

5.3 The Senior Cohort Postsecondary Education Transcript Study (PETS) Sample 

In 1984, postsecondary transcripts were requested for all members of the 
1980 senior cohort who reported in either the first or second follow-up survey
attending any form of postsecondary school since leaving high school. Thus, no
further probabilistic sampling was done to define the PETS sample. The only
restriction on inclusion in the PETS sample was that the respondent must have
provided the name of the school attended, so that records could be requested.
Thus, omitted from the transcript study were a very few sample members who
indicated that they had attended some form of postsecondary school, but who gave
no indication during either follow-up survey of the name of the school(s). In
all, 7,776 members of the 1980 senior cohort satisfied the initial criteria for
inclusion by naming at least one school in at least one of the follow-up surveys.

5.4 The Sophomore Cohort Postsecondary Education Transcript Study Sample 

In order to conserve resources, a somewhat more restrictive sample was drawn 
for the HS&B sophomore cohort than was drawn for the senior cohort. The 
Department of Education was primarily interested in learning about the HS&B 
sample members who exhibited a "normal" pattern of postsecondary school
attendance. Therefore, it was decided at the outset that those students who
entered postsecondary school in the fall immediately following their high school
graduation would be drawn into the sample. With the exception of vocational
students, students who delayed their postsecondary school until the winter of
1983 or later were not included in the sample.

Table 5-3

1980 Sophomore Cohort Second Follow-Up Sample
Distribution by Race-Ethnicity Typology

                                 Population size         Second follow-up
Student Status                                % of                   % of
Category                             N        Total         N        Total

Hispanic
  Cuban/Puerto Rican              89,674       2.4%         990       6.7%
  High achievement                85,762       2.3%         886       6.0%
  Other Hispanic                 299,802       7.9%       1,375*      9.3%

Asian/Pacific Islander            46,835       1.2%         431       2.9%
Native American                   48,418       1.3%         291       2.0%

Black
  High achievement                84,544       2.2%         741       5.0%
  Other                          375,185       9.9%       1,295       8.7%

High Achievement/
  Low-SES whites                  69,759       1.8%         388       2.6%
All others                     2,679,309      70.9%       8,428      56.8%

TOTAL                          3,779,288     100.0%      14,825     100.0%

NOTE: For this typology, sample members were assigned to ethnic or racial
      categories on a sequential or hierarchical basis. That is, individuals
      who reported Cuban or Puerto Rican origin or descent in either the base
      year or first follow-up were so classified in this typology.
      High-achievement Hispanics were then classified among the remaining
      non-Cuban/non-Puerto Rican cases. (Since some Cubans and Puerto Ricans
      were also "high achievement," the total number of high-achievement
      Hispanics is larger than shown in this table.) "Other Hispanics" were
      then classified from among all remaining cases not assigned to the two
      previous categories. This procedure was repeated sequentially for each
      remaining category in the table. The result is a distribution of
      mutually exclusive categories whose contents sum to the population or
      sample size. The distributions presented mask considerable overlap
      among groups within the sample (e.g., Blacks who are also Hispanic).

No probabilistic sampling was undertaken; rather, students who were 
considered of greatest policy interest were selected into the sample with 
certainty. More specifically, the sample was selected in two steps. First, 
students exhibiting certain attendance patterns were selected, and second, the 
schools they attended were selected. Students were selected into the sample on 
the basis of their responses to second follow-up (1984) and third follow-up 
(1986) questions on schools attended after leaving high school. 

Under Step 1, students defined as normal persisters were drawn into the 
sample with certainty. Normal persisters were students who began attending any 
postsecondary school (with the exception of foreign schools) full-time by October 
1982 and did not leave the school until after August of 1982. This definition
removes students who attended school during the summer only in 1982. Normal
persisters attended any of six types of schools: proprietary, private technical 
or two-year, public technical, two-year college or university, four-year public 
university, and four-year private college or university. 

Next, vocational students were drawn into the sample. These students were 
not normal persisters and started attending a proprietary school, private 
technical or two-year school, or public technical school and did not leave until 
after August 1982. Again, this definition eliminates students who studied in the
summer only. Vocational students were included even if they were attending
school part-time. 
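
The Step 1 definitions above can be expressed as a small classification rule.
The sketch below is an interpretation offered for illustration, not the
selection program used in the study. The record fields (`start`, `end`,
`full_time`, `school_type`, `foreign`) are hypothetical, and "by October 1982"
is assumed here to mean on or before October 31, 1982.

```python
from datetime import date

# School types treated as vocational in the definitions above.
VOCATIONAL_TYPES = {"proprietary", "private technical or two-year", "public technical"}
START_CUTOFF = date(1982, 10, 31)  # assumed reading of "by October 1982"
SUMMER_END = date(1982, 8, 31)     # leaving by this date means summer-only attendance


def stayed_past_summer(end):
    """True if the student was still enrolled after August 1982 (None = still enrolled)."""
    return end is None or end > SUMMER_END


def classify(spell):
    """Classify one reported enrollment spell under the Step 1 definitions."""
    if not stayed_past_summer(spell["end"]):
        return "not selected"
    if not spell["foreign"] and spell["full_time"] and spell["start"] <= START_CUTOFF:
        return "normal persister"
    if spell["school_type"] in VOCATIONAL_TYPES:
        return "vocational student"
    return "not selected"


# Hypothetical enrollment spells for illustration.
fall_entrant = {"start": date(1982, 9, 1), "end": None, "full_time": True,
                "school_type": "four-year public", "foreign": False}
summer_only = {"start": date(1982, 6, 15), "end": date(1982, 8, 15), "full_time": True,
               "school_type": "four-year public", "foreign": False}
late_part_time_vocational = {"start": date(1983, 1, 10), "end": date(1983, 12, 1),
                             "full_time": False, "school_type": "proprietary",
                             "foreign": False}
```

Under these assumptions the fall entrant is a normal persister, the summer-only
student is not selected, and the part-time proprietary school student who began
in 1983 is retained as a vocational student.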

Under Step 2, the schools were selected. Because a certain proportion of 
students transfer from their first school to other schools, there are necessarily 
more transcripts than students. In fact, the sample that results from the two- 
step selection process is a sample of student-school combinations for which 
transcript information is collected. 

No attempt was made to request transcripts from all schools attended by the
sample of students. Transcripts were selected from second and third schools only 
if they represented a pattern of normal progression through postsecondary school. 
The schools were selected as follows: 

■ If a student was a normal persister and started attending a two-year 
public, four-year private, or four-year public college or university, 
this school was selected.

■ Any other four-year private or four-year public institution was 
selected if, after attending the first school, the student began 
attending this school as a full-time student.

■ Any two-year public university was selected if, after attending the 
first school, the student attended this school and also attended 
another four-year private or four-year public university. 

■ If the student was a normal persister and began attending a
proprietary, private technical or two-year, or public technical
school, this school was selected.

■ If a student was a vocational student, then the first school was 
selected for this student. If a vocational student began attending a 
second vocational school, this school was not selected. 

A total of 6,098 students and 2,139 schools were selected into the sample. 
Table 5-4 shows the distribution of students and transcripts. 

However, there were 565 students for whom transcripts were not received.
A variety of reasons were given for not sending transcripts: schools
refused to release transcripts; schools refused to cooperate with the transcript
study; transcripts were lost or destroyed; schools closed; and there was no
response from the school. In addition, there were some students whom school
officials claimed had never enrolled or did not complete sufficient work to have
an enrollment record.

Because the evidence for non-attendance is not completely conclusive for the 
students who reportedly "never attended" or any of the rest of the 565 cases,
these students have been included on the public release data files (including raw
weights and selected HS&B questionnaire data). These cases also have a single
dummy Transcript Record whose Final Disposition field indicates the reason for no
response. In the course of normal transcript data analysis, these cases may be
deleted from the analysis data files by selecting for the analysis only cases
with non-zero values for one of the transcript weights.
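
The case selection described above amounts to a simple filter on the chosen
weight. In the sketch below the weight names WT1 and WT2 follow the labels used
in Tables 5-6 and 5-7; the record layout, case identifiers, and weight values
are hypothetical.

```python
def analysis_cases(records, weight_field):
    """Keep only cases with a non-zero value on the chosen transcript weight."""
    return [r for r in records if r[weight_field] > 0]


# Hypothetical student-level records for illustration.
cases = [
    {"id": 1, "WT1": 250.0, "WT2": 300.0},  # transcript and all questionnaires
    {"id": 2, "WT1": 0.0, "WT2": 0.0},      # no transcript received
    {"id": 3, "WT1": 410.5, "WT2": 0.0},    # transcript, but a missed survey round
]
transcript_sample = analysis_cases(cases, "WT1")
full_panel_sample = analysis_cases(cases, "WT2")
```

The choice of weight determines the analysis sample: selecting on WT2 is more
restrictive because it also requires completed questionnaires from all survey
rounds.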



Table 5-4

High School and Beyond Sophomore Postsecondary Transcript Sample

Student Group                                 Students     Transcripts

Normal persisters in public 2-year,
  private 4-year, and public 4-year             5,122         6,453

Normal persisters in proprietary,
  private technical 2-year, or public
  technical school                                572           572

Vocational students                               404           404

TOTAL                                           6,098         7,429

Table 5-5 shows the distribution of the number of schools reported by
students who were considered in-scope. The only students considered out of scope
were those 229 who reportedly "never attended" the institutions they had named in
the second or third follow-up survey. The analyst will note that these cases are
coded "never attended" on the transcript-level file. Also deleted from this table are 47
duplicate transcripts. Over three-fifths of the students reported attending only
one institution in their responses to the follow-up surveys. An additional 30 
percent of these cases reported attending exactly two schools. Only about 8 
percent (602) reported attending three or more postsecondary schools during the 
four-year period since leaving high school. 

5.5 Sample Weights 

The general purpose of weighting survey data is threefold: the weights allow
data from the sample to be used for estimating population totals; the weights 
compensate for unequal probabilities of selection (or retention) in the survey; 
and the weights adjust for nonresponse in the study. 

The HS&B weights are based on the inverse of the selection probabilities 
through all stages of the sampling process; the nonresponse adjustments are based 
on the inverse of the response rates within weighting classes. A "raw" weight,
which reflects only the selection probabilities and which is not adjusted for 
nonresponse, is also calculated and will be included on the data files for the 
Postsecondary Transcript Study. The raw weight allows analysts to construct 
their own adjustments for nonresponse; in addition, the raw weight was used in 
calculating weighted response rates for the purpose of nonresponse adjustment. 
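
The first purpose listed above, estimating population totals from sample data,
can be illustrated with a short sketch. The function names, the indicator
values, and the weights are hypothetical and are shown only to make the
arithmetic concrete.

```python
def weighted_total(values, weights):
    """Estimate a population total as the weighted sum of sample values."""
    return sum(v * w for v, w in zip(values, weights))


def weighted_mean(values, weights):
    """Estimate a population mean (or proportion) as a ratio of weighted sums."""
    return weighted_total(values, weights) / sum(weights)


# Hypothetical indicator (1 = attended a vocational school) and case weights.
attended_vocational = [1, 0, 0, 1]
case_weights = [100.0, 300.0, 200.0, 400.0]

estimated_total = weighted_total(attended_vocational, case_weights)
estimated_share = weighted_mean(attended_vocational, case_weights)
```

Here the four sample members represent 1,000 persons in the population, of whom
an estimated 500 (one half) attended a vocational school.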



Table 5-5

Number of Postsecondary Schools Reported by
Members of the HS&B 1980 Sophomore Cohort

Number of schools     Number of cases     Percent

One                        4,606           62.0 %
Two                        2,226           29.9 %
Three or more                602            8.1 %

TOTAL*                     7,434          100.0 %

*NOTE: An additional 342 cases who reported attending a single school were
defined as ineligible and are excluded from this table.

The weighting procedures for the Postsecondary Education (PSE) Transcripts
Study involved two major steps:

Step 1.  Calculation of a preliminary, or raw, weight based on the
         inverse of the product of the probabilities of selection for
         the base year sample and retention in the follow-up surveys.
         This new raw weight is simply the follow-up raw weight times
         the inverse of the probability of retention in the PSE
         Transcript sample.

Step 2.  Adjustment of the raw weight to compensate for "unit"
         nonresponse--that is, for nonresponse on an entire
         questionnaire, test, or transcript. (By definition, the new raw
         weight, RAWWT, is unadjusted for nonresponse.)

For the sophomore cohort, the PSE Transcript Study involved no new 
subsampling beyond what had been carried out for the HS&B second follow-up; that
is, all second follow-up cases deemed eligible for the PSE Transcript Study were 
included in the sample. (Relative to the senior cohort, a somewhat more 
restrictive definition of eligibility was used in designating cases for the 
sophomore cohort PSE Transcript sample. The sample consisted mainly of students
who enrolled full-time in fall 1982 in an academic institution or who attended a
vocational technical school any time before July 1986.) Thus, the raw weight
described in Step 1 above is the same as the raw weight for the second (and 
third) follow-up survey. 

Two separate nonresponse adjustments were calculated using the general 
technique described in Step 2. Both sets of nonresponse adjustments apply to all 
6,098 cases selected for the PSE Transcript Study. The first adjustment corrects 
for nonresponse in the Transcript Study itself. For the purpose of this 
adjustment, a case was counted as complete if one or more transcripts were
obtained for that case; a case was treated as a nonrespondent if no transcripts 
were obtained. The second adjustment corrects for nonresponse in the Transcript 
Study and the four prior surveys (i.e., the base year and three follow-ups). For 
the purpose of this adjustment, a case was counted as complete only if the case 
had at least one transcript and completed questionnaires for all four HS&B survey 
rounds; all other cases were counted as nonrespondents. 

This approach to weighting defines the sample person, rather than the
individual transcript, as the unit of analysis. The weights apply to the person
and to all the data associated with that person and are not intended to be
applied to individual transcripts.

Both sets of nonresponse adjustments were computed as simple ratios (sum of
the raw weights for all cases over the sum of the raw weights for the completed
cases) within 29 weighting cells. The weighting cells were defined by
cross-classifying cases according to the type of high school attended, sex, race, and
the type of postsecondary school attended. These four variables have been
consistently related to nonresponse in the HS&B studies and were used in defining
nonresponse adjustment cells for the Senior PSE Transcript Study. The
cross-classification results in 48 cells; cells with 20 or fewer cases were pooled with
adjacent cells having similar completion rates. (In a few cases, small cells
requiring nonresponse adjustments close to 1.0 were left intact.) After pooling,
29 cells remained; nonresponse adjustments were calculated for each cell.

Within each cell, the nonresponse adjustment was obtained by dividing the
sum of weights for all selected cases by the sum of weights for the "completed"
cases. The nonresponse adjustment is thus the inverse of the weighted response
rate. The final adjusted weights are just the product of the adjustment factors
and the raw weights. Tables 5-6 and 5-7 below present the weighting cells and
adjustment factors for both sets of PSE Transcript weights.
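
The cell-level calculation described above can be sketched as follows. This is
a minimal illustration rather than the production weighting program: the record
fields (`cell`, `rawwt`, `complete`) and the example values are hypothetical,
and every cell is assumed to contain at least one completed case, as the
pooling of small cells was intended to ensure.

```python
from collections import defaultdict


def nonresponse_adjustments(cases):
    """Adjustment factor per cell: sum of raw weights for all selected cases
    divided by the sum of raw weights for the completed cases."""
    eligible = defaultdict(float)
    completes = defaultdict(float)
    for case in cases:
        eligible[case["cell"]] += case["rawwt"]
        if case["complete"]:
            completes[case["cell"]] += case["rawwt"]
    return {cell: eligible[cell] / completes[cell] for cell in eligible}


def adjusted_weights(cases):
    """Final weight: raw weight times the cell factor; zero for nonrespondents."""
    factors = nonresponse_adjustments(cases)
    return [
        case["rawwt"] * factors[case["cell"]] if case["complete"] else 0.0
        for case in cases
    ]


# Hypothetical cases in two weighting cells.
sample = [
    {"cell": 1, "rawwt": 100.0, "complete": True},
    {"cell": 1, "rawwt": 100.0, "complete": True},
    {"cell": 1, "rawwt": 50.0, "complete": False},
    {"cell": 2, "rawwt": 90.0, "complete": True},
    {"cell": 2, "rawwt": 30.0, "complete": False},
]
factors = nonresponse_adjustments(sample)
final_weights = adjusted_weights(sample)
```

Because each factor is the inverse of the weighted response rate in its cell,
the adjusted weights of the completed cases sum to the same total as the raw
weights of all selected cases, both within each cell and overall.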

If a completed case is defined as one for which at least one transcript was 
obtained, then the weighted completion rate for the PSE Transcripts Study is 90.8 
percent (1,292,191 weighted completes over 1,422,340 eligible; see Table 5-6). 
The average adjustment factor is just the inverse of this completion rate (i.e.,
1.10). Similarly, if a completed case is defined as one with at least one
transcript and questionnaire data from all prior waves of the survey, the
weighted completion rate is 79.3 percent (see Table 5-7), and the mean adjustment
factor is 1.26. 
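
The arithmetic behind these figures can be checked directly from the totals in
Tables 5-6 and 5-7. The short sketch below is provided only as a worked
example; the variable and function names are illustrative.

```python
ELIGIBLE = 1_422_340  # sum of raw weights for all selected cases
COMPLETES = {
    "WT1": 1_292_191,  # at least one transcript (Table 5-6)
    "WT2": 1_127_955,  # transcript plus all four questionnaires (Table 5-7)
}


def completion_rate_and_factor(completes, eligible):
    """Return the weighted completion rate and its inverse, the mean adjustment."""
    rate = completes / eligible
    return rate, 1.0 / rate


summary = {
    name: completion_rate_and_factor(total, ELIGIBLE)
    for name, total in COMPLETES.items()
}
for name, (rate, factor) in summary.items():
    print(f"{name}: completion rate {100 * rate:.1f} percent, "
          f"mean adjustment {factor:.2f}")
```

The computed values agree with the rates of 90.8 and 79.3 percent and the mean
adjustment factors of 1.10 and 1.26 reported above.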

Relative to the senior cohort PSE Transcript weights, three differences are 
readily apparent. First, the size of the population for the sophomore cohort 
(estimated by the sum of the weights) is smaller than that for the seniors (1.4 
million versus 1.8 million; cf. Tables 5-6 and 5-7 with Tables 5.4-1 and 5.4-2 in
the High School and Beyond Senior Cohort Postsecondary Education Transcript Study
Data File Users' Manual). This difference appears to reflect the more
restrictive criteria used in defining eligibility for the PSE Transcript Study 
within the sophomore cohort. 

A second difference is that the adjustment factors are somewhat larger for 
the sophomore cohort than for the senior cohort; this reflects the difference in 
response rates. Overall, at least one transcript was obtained from about 94 
percent of the senior cohort sample (versus 91 percent for the sophomore cohort).
Similarly, cases with at least one transcript and complete questionnaire data
from prior rounds constituted 86 percent of the senior cohort PSE Transcript
sample (versus 79 percent for the sophomore cohort).

Finally, we note that for most of the weighting cells involving cases with
only vocational postsecondary education (rows 23 through 29 in Tables 5-6 and
5-7), the estimated population sizes are actually somewhat larger for the sophomore
cohort than for the senior cohort, despite the reduction in the overall
population size noted earlier (cf. Tables 5.4-1 and 5.4-2 in the senior cohort
Data File Users' Manual). This appears to reflect a real underlying difference
between the two cohorts and is consistent with other data from the High School
and Beyond surveys. For example, as of the third follow-up (when cases were
selected for the sophomore PSE Transcripts Study), about 15 percent of the
sophomore cohort reported that they had attended vocational school; the
corresponding figure for the senior cohort (as of the second follow-up, when
cases were selected for the senior PSE transcript study) is only 11 percent.

Table 5-6

Nonresponse Adjustments to Sampling Weights
for Completed Cases in HS&B Sophomore Cohort
Postsecondary Education Transcript Study (WT1)

                 Weighting classes

     Vocational      Type of                            Sum of       Sum of
     postsecondary   secondary                        weights:     weights:   Nonresponse
     only            school        Sex   Race         eligible    completes    adjustment

 1   No              Reg Public    M     Hisp           17,799       16,848        1.0565
 2   No              Reg Public    M     Black          35,482       31,721        1.1185
 3   No              Reg Public    M     Other         375,958      360,566        1.0426
 4   No              Reg Public    F     Hisp           14,856       12,957        1.1465
 5   No              Reg Public    F     Black          66,612       63,151        1.0548
 6   No              Reg Public    F     Other         445,419      427,335        1.0423
 7   No              Hisp Public   M     Hisp            5,516        4,744        1.1625
 8   No              Hisp Public   M     Black             800          753        1.0614
 9   No              Hisp Public   M     Other           4,848        4,688        1.0341
10   No              Hisp Public   F     Hisp            6,579        6,370        1.0328
11   No              Hisp Public   F     Black           2,608        2,172        1.2009
12   No              Hisp Public   F     Other           5,227        4,791        1.0911
13   No              Catholic      M     Hisp            1,924        1,733        1.1104
14   No              Catholic      M     Black           2,138        2,014        1.0612
15   No              Catholic      M     Other          53,256       50,106        1.0629
16   No              Catholic      F     Hisp            4,180        4,064        1.0285
17   No              Catholic      F     Black           4,770        4,264        1.1186
18   No              Catholic      F     Other          55,698       53,106        1.0487
19   No              Oth Private   M     Other          30,138       27,855        1.0820
20   No              Oth Private   F     Other          32,249       31,251        1.0319
21   No              Oth Private   M&F   Hisp            1,631        1,631        1.0000
22   No              Oth Private   M&F   Black           2,010        2,006        1.0020
23   Yes             Reg Public    M     Hisp            6,714        2,517        2.6673
24   Yes             Reg Public    M     Black          12,019        8,301        1.4478
25   Yes             Reg Public    M     Other          73,211       55,113        1.3283
26   Yes             Reg Public    F     Hisp           10,314        4,218        2.4449
27   Yes             Reg Public    F     Black          24,429       12,872        1.8978
28   Yes             Reg Public    F     Other          97,540       72,065        1.3535
29   Yes             All Private   M&F   All            28,416       22,979        1.2365

TOTAL                                                1,422,340    1,292,191

Table 5-7

Nonresponse Adjustments to Sampling Weights
for Cases with At Least One Postsecondary Transcript and
Completed Questionnaires from the Base Year, First, Second, and Third
Follow-Up Surveys (WT2)

                 Weighting classes

     Vocational      Type of                            Sum of       Sum of
     postsecondary   secondary                        weights:     weights:   Nonresponse
     only            school        Sex   Race         eligible    completes    adjustment

 1   No              Reg Public    M     Hisp           17,799       13,930        1.2777
 2   No              Reg Public    M     Black          35,482       27,538        1.2884
 3   No              Reg Public    M     Other         375,958      315,076        1.1931
 4   No              Reg Public    F     Hisp           14,856       11,569        1.2840
 5   No              Reg Public    F     Black          66,612       51,900        1.2835
 6   No              Reg Public    F     Other         445,419      381,640        1.1671
 7   No              Hisp Public   M     Hisp            5,516        3,519        1.5676
 8   No              Hisp Public   M     Black             800          729        1.0968
 9   No              Hisp Public   M     Other           4,848        3,761        1.2889
10   No              Hisp Public   F     Hisp            6,579        5,189        1.2679
11   No              Hisp Public   F     Black           2,608        2,172        1.2009
12   No              Hisp Public   F     Other           5,227        4,733        1.1044
13   No              Catholic      M     Hisp            1,924        1,676        1.1479
14   No              Catholic      M     Black           2,138        1,696        1.2600
15   No              Catholic      M     Other          53,256       46,264        1.1511
16   No              Catholic      F     Hisp            4,180        3,901        1.0713
17   No              Catholic      F     Black           4,770        3,884        1.2281
18   No              Catholic      F     Other          55,698       50,526        1.1022
19   No              Oth Private   M     Other          30,138       22,389        1.3460
20   No              Oth Private   F     Other          32,249       24,516        1.3154
21   No              Oth Private   M&F   Hisp            1,631        1,251        1.3044
22   No              Oth Private   M&F   Black           2,010        1,829        1.0989
23   Yes             Reg Public    M     Hisp            6,714        2,117        3.1715
24   Yes             Reg Public    M     Black          12,019        6,959        1.7271
25   Yes             Reg Public    M     Other          73,211       44,991        1.6273
26   Yes             Reg Public    F     Hisp           10,314        2,393        4.3103
27   Yes             Reg Public    F     Black          24,429       10,066        2.4265
28   Yes             Reg Public    F     Other          97,540       61,099        1.5964
29   Yes             All Private   M&F   All            28,416       20,642        1.3766

TOTAL                                                1,422,340    1,127,955
Table 5-8 shows the statistical properties of the raw weights and
the two sets of adjusted weights. The table includes the mean, sum,
variance, standard deviation, coefficient of variation, minimum, maximum,
skewness, kurtosis, and the number of weighted cases for each weight.
Note that each of the three weights is constrained to sum to the same
estimated population total (1,422,340). Similarly, the sums of the three
weights are constrained to be equal within each of the 29 weighting cells.
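
The cell-level constraint described above can be illustrated with a short Python sketch. It assumes, as the column headings of the nonresponse adjustment tables suggest, that each cell's adjustment factor is the sum of raw weights for eligible cases divided by the sum of raw weights for completed cases (for class 1 of Table 5-7, 17,799 / 13,930 is approximately 1.2777). The function name and the record fields (cell, raw_weight, complete) are hypothetical and are not taken from the HS&B data files.

```python
from collections import defaultdict


def nonresponse_adjust(cases):
    """Apply a weighting-cell nonresponse adjustment (illustrative only).

    cases: list of dicts with hypothetical keys
        'cell'       -- weighting cell identifier
        'raw_weight' -- raw sampling weight
        'complete'   -- True if the case responded (is a "complete")
    Returns (factors, adjusted), where factors maps each cell to its
    adjustment factor and adjusted is a copy of cases with 'adj_weight'.
    """
    eligible = defaultdict(float)   # sum of raw weights, all eligible cases
    completes = defaultdict(float)  # sum of raw weights, completes only
    for case in cases:
        eligible[case["cell"]] += case["raw_weight"]
        if case["complete"]:
            completes[case["cell"]] += case["raw_weight"]

    # Factor = eligible weight sum / completes weight sum, per cell.
    factors = {
        cell: eligible[cell] / completes[cell]
        for cell in eligible
        if completes[cell] > 0
    }

    adjusted = []
    for case in cases:
        # Nonrespondents receive zero weight; completes are inflated so
        # that the cell total matches the raw-weight total for the cell.
        if case["complete"]:
            weight = case["raw_weight"] * factors[case["cell"]]
        else:
            weight = 0.0
        adjusted.append({**case, "adj_weight": weight})
    return factors, adjusted
```

Under this scheme the adjusted weights of the completes in a cell sum to the raw-weight total of all eligible cases in that cell, which is why the three weights share the same sums within cells and overall.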

5.6 Standard Errors and Design Effects 

Statistical estimates based upon High School and Beyond data are
subject to sampling variability. Sampling errors arise because data are
collected from only a randomly selected portion of the members of a
population of interest. The HS&B sophomore cohort sample, as realized,
is only one representation of a large number of samples of similar size
that might have been drawn. Sampling errors are directly related to the
underlying variability of the property being measured, and are inversely
related to the number of observations contributing to the statistical
estimates.

Because the sample design for the HS&B cohorts involved 
stratification, disproportionate sampling from certain strata, and 
clustered (i.e., multi-stage) probability sampling, the calculation of 
exact standard errors for survey estimates can be difficult and 

Table 5-8

High School and Beyond Sophomore Cohort
Postsecondary Education Transcripts Study
Statistical Properties of Sample Case Weights*

Weight                          RAWWT         WT1         WT2

Mean                              233         257         289
Sum                         1,422,340   1,422,340   1,422,340
Variance                       42,967      53,930      68,608
Standard deviation                207         232         262
Coefficient of variation           89          90          91
Minimum                             1           1           1
Maximum                         2,219       2,392       3,058
Skewness                         1.65        1.77        1.78
Kurtosis                          7.1         7.7         8.2
Number of cases                 6,098       5,533       4,930

*NOTE: All entries except skewness and kurtosis have been rounded
to the nearest whole number; the coefficient of variation
is in percentage terms.
expensive. Popular statistical analysis packages such as SPSS (Statistical
Package for the Social Sciences) or SAS (Statistical Analysis System) normally
calculate standard errors using the assumption that the data being analyzed were
collected from simple random samples. As is described in detail in the High
School and Beyond sample design reports for each survey wave, the HS&B sample
design is, on balance, somewhat less efficient than simple random samples of
equal size. Thus, sampling errors generated by SPSS and SAS will normally
underestimate significantly the sampling variability of statistical estimates
such as population means, percentages, and more complex statistics such as
correlation and regression coefficients.

Several procedures are available for calculating precise estimates of
sampling error for complex samples. Procedures such as Taylor series
approximations, Balanced Repeated Replication (BRR), and Jackknife Repeated
Replication (JRR) vary somewhat in computational convenience and cost, and in
their ability to account for several sources of sampling variability, most
notably clustered selection of sample cases.
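
To give a sense of how a replication method accounts for clustering, the following Python sketch implements a generic, unstratified delete-one-cluster jackknife for a weighted proportion. It is an illustration under simplifying assumptions, not the procedure used for HS&B (whose published variances were generally computed with BRR on a stratified design). The function names and the (cluster, weight, value) record layout are hypothetical.

```python
def weighted_proportion(records):
    """Weighted mean of a 0/1 variable.

    records: list of (cluster_id, weight, value) tuples, value in {0, 1}.
    """
    total_weight = sum(weight for _, weight, _ in records)
    return sum(weight * value for _, weight, value in records) / total_weight


def jackknife_variance(records):
    """Delete-one-cluster jackknife variance of the weighted proportion.

    Each replicate drops one whole cluster (e.g., one school) and
    re-estimates the proportion from the remaining clusters. With G
    clusters the variance estimate is
        (G - 1) / G * sum over replicates of (replicate - full)^2.
    Stratification is ignored in this simplified sketch.
    """
    clusters = sorted({cluster for cluster, _, _ in records})
    n_clusters = len(clusters)
    full_estimate = weighted_proportion(records)
    squared_deviations = 0.0
    for dropped in clusters:
        kept = [r for r in records if r[0] != dropped]
        replicate = weighted_proportion(kept)
        squared_deviations += (replicate - full_estimate) ** 2
    return (n_clusters - 1) / n_clusters * squared_deviations
```

Because whole clusters are removed together, similarity among students within the same school shows up as larger swings between replicates, which a simple-random-sample formula would miss.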

After each survey wave since the base year, sampling variances have been
calculated for about thirty estimated proportions or means for the whole sample
and for several subgroups (domains), and have been reported in the data file
users' manuals for each public release tape. In general, these calculations have
been carried out using BRR. However, comparisons of variance estimates provided
by Taylor series and BRR carried out at the time of the HS&B first follow-up
survey showed little difference in the resulting error estimates for such
statistics as means, proportions, and Pearson correlation coefficients.

In addition to standard errors, the design effects for each estimate (DEFF)
and the square roots of each design effect (DEFT) were calculated and reported.
The design effect is a measure of the inefficiency of the sample estimate
relative to a simple random sample of equal size. It is defined as the ratio of
the actual variance of an estimate (i.e., the square of the standard error) to
the variance of the same estimate from a simple random sample with the same
number of cases. For proportions, the estimated simple random sample variance is
just

     VAR(SRS) = p(1 - p)/n                                    (1)

in which

     p = the estimated proportion

and

     n = the number of cases with non-missing data
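
The definitions above translate directly into a few lines of Python. This is a minimal sketch: the function names are hypothetical, and the "actual" variance passed in is assumed to come from a design-based method such as BRR or Taylor series.

```python
import math


def srs_variance(p, n):
    """Simple random sample variance of a proportion, expression (1)."""
    return p * (1.0 - p) / n


def design_effect(actual_variance, p, n):
    """DEFF: actual (design-based) variance over the SRS variance."""
    return actual_variance / srs_variance(p, n)


def root_design_effect(actual_variance, p, n):
    """DEFT: square root of the design effect."""
    return math.sqrt(design_effect(actual_variance, p, n))
```

For instance, with p = 0.4 and n = 1,000 the simple random sample variance is 0.00024; a design-based variance of 0.00048 for the same estimate would correspond to a DEFF of 2.0 and a DEFT of about 1.41.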

Like almost all national samples, the High School and Beyond sample is not a 
simple random sample. The High School and Beyond sample departs from the model 
of simple random sampling in three major respects: the observations are 
clustered at the school level; major groups (such as students who attended 
private schools) are deliberately represented disproportionately; and the sample 
is stratified by type of school. Each of these departures from simple random 
sampling has an effect on efficiency, which is reflected in the design effect. 
Separate sampling errors and design effects have not been calculated for the
postsecondary transcript data. The calculations of sampling errors and design
effects performed for the High School and Beyond 1980 Sophomore Cohort Second
Follow-Up (1984) Data File User's Manual have been reproduced and included in
Table 5-9.

The mean design effects given in Table 5-9 can be used to calculate
approximate standard errors for estimates based upon transcript data. For
example, the standard error of a proportion can be estimated using the square
root of the expression in (1) (above) times the mean root design effect (DEFT):

     SE = DEFT (p[1 - p]/n)^(1/2)                             (2)
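
Expression (2) can be applied as in the following Python sketch. The function name is hypothetical, and the inputs in the example (a proportion of 0.30 based on 500 cases) are made-up illustrative values; only the DEFT of 1.48 is taken from Table 5-9 (mean for the total population).

```python
import math


def approximate_standard_error(p, n, deft):
    """Approximate standard error of a proportion, expression (2).

    p    -- estimated proportion
    n    -- number of cases with non-missing data
    deft -- root design effect chosen from Table 5-9
    """
    return deft * math.sqrt(p * (1.0 - p) / n)


# Illustrative values only: p and n are invented for the example.
se = approximate_standard_error(p=0.30, n=500, deft=1.48)
print(round(se, 4))  # 0.0303
```

Without the DEFT multiplier the same inputs would give a standard error of about 0.0205, so ignoring the sample design would understate the sampling error by roughly one third in this example.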



With the exception of those for Hispanics, the DEFTs in Table 5-9 for
subgroups are generally 10 percent smaller than that for the total population.
The relative efficiency of the Hispanic subsample continues to be affected by the
somewhat larger follow-up cluster sizes for Hispanic sample members in specific
schools and relatively few geographical areas, and higher variability in sample
weights because some Hispanics (those in so-called "Hispanic schools") were
sampled at very high rates while others (in regular public schools) were sampled
at rates closer to those of majority whites. Furthermore, the variability of the
DEFTs for Hispanics is over twice that observed for most other subgroups. Thus,
for analysis of data from Hispanics, the use of a single generalized design
effect to inflate simple random sample estimates of sampling errors involves a
larger degree of approximation. Nevertheless, the differences between
Hispanics and other groups remain generally small. Researchers who use design
effect factors to estimate standard errors for Hispanic sample data and who
prefer to be statistically conservative may wish to choose a design effect
slightly larger than the mean of 1.48 in Table 5-9.

In addition, Table 5-10 presents selected distributional statistics for the
DEFF and DEFT factors for proportions taken from prior survey waves. These
tables, as well as several informal analyses carried out at NORC and at NCES,
generally confirm that, with minor exceptions noted, the design effects have
remained reasonably constant across survey waves and population domains, and show
relatively small variability across survey items within waves and domains.
Table 5-9

Distributional Statistics for Design Effects and Root Design
Effects for 30 Survey Measures for 12 Domains

Domain                  Statistic                  DEFF          DEFT

Total population        Mean                       2.19          1.48
                        Minimum                    1.40          1.18
                        Maximum                    2.68          1.64
                        Standard deviation         0.29          0.10

Hispanic                Mean                       3.11          1.75
                        Minimum                    1.69          1.30
                        Maximum                 [illegible]      2.32
                        Standard deviation         0.76          0.21

Black                   Mean                       2.19          1.47
                        Minimum                    1.24          1.11
                        Maximum                    2.92          1.71
                        Standard deviation         0.36          0.13

Whites and others       Mean                       1.92          1.38
                        Minimum                    1.32          1.15
                        Maximum                    2.38          1.54
                        Standard deviation         0.23          0.08

Female                  Mean                       2.06          1.43
                        Minimum                    1.51          1.23
                        Maximum                    2.42          1.55
                        Standard deviation         0.21          0.07

Male                    Mean                       2.07          1.44
                        Minimum                    1.37          1.17
                        Maximum                    2.59          1.61
                        Standard deviation         0.24          0.09

Lowest quartile SES     Mean                       1.83          1.35
                        Minimum                    1.22          1.10
                        Maximum                    2.31          1.52
                        Standard deviation      [illegible]   [illegible]

Middle quartiles SES    Mean                       2.06          1.43
                        Minimum                    1.43          1.20
                        Maximum                    2.41          1.55
                        Standard deviation      [illegible]   [illegible]

Highest quartile SES    Mean                       1.92          1.38
                        Minimum                    1.31          1.14
                        Maximum                    2.48          1.57
                        Standard deviation      [illegible]      0.10

Received no PSE         Mean                       1.98          1.40
                        Minimum                    1.25          1.12
                        Maximum                    2.82          1.68
                        Standard deviation         0.34          0.12

Received some PSE       Mean                       2.09          1.44
                        Minimum                    1.46          1.21
                        Maximum                    2.53          1.59
                        Standard deviation         0.19          0.07

Four-year degree        Mean                       1.63          1.26
                        Minimum                    0.16          0.39
                        Maximum                    2.14          1.46
                        Standard deviation         0.42          0.21
Table 5-10

Distributional Statistics for Design Effects and Root Design Effects
for Proportions from Various Survey Waves
HS&B Sophomore Cohort

Survey                                   DEFF      DEFT

First Follow-Up, using
First Follow-Up Weight

     Mean                                3.14      1.72
     Minimum                             1.33      1.15
     Maximum                             7.41      2.72
     Standard deviation                  1.80      0.47

Changes in Proportions between
BY and FFU, using FFU Weight

     Mean                                1.80      1.33
     Minimum                              .95       .98
     Maximum                             3.45      1.86
     Standard deviation                   .61       .21

Second Follow-Up, using
Second Follow-Up Weight

     Mean                                2.40      1.54
     Minimum                             1.23      1.11
     Maximum                             4.00      2.00
     Standard deviation                  0.56      0.18
NOTES TO CHAPTER 5

1. For further details on the base year sample design see Martin R. Frankel,
Luane Kohnke, David Buonanno, and Roger Tourangeau, Sample Design Report
(Chicago: NORC, 1981).

2. The sampling frame, defined as the universe of high schools in the
United States, was obtained from the 1978 list of U.S. elementary and secondary
schools of the Curriculum Information Center, a private firm. This was
supplemented by the NCES lists of public and private elementary and secondary
schools. Any school listed in any of these files that contained a tenth grade, a
twelfth grade, or both was made part of the frame.

3. Apart from substitution for schools that refused, there were a number of
schools in the originally drawn sample that were "out of scope," that is, they
failed to fit the criteria for inclusion in the sample. The sample was then
augmented through selection of an additional school for each out-of-scope school,
within major strata. Most of the out-of-scope schools were area vocational
schools having no enrollment of their own, although they were listed in the frame
as having enrollments.
Appendix A: List of Endorsing Institutions

Contents of School Transcript Request Packages
NATIONAL LONGITUDINAL STUDIES PROGRAM

High School and Beyond

A National Longitudinal Study for the 1980's

Sponsored by the Center for Education Statistics,
U.S. Department of Education

The professional organizations listed below fully endorse and encourage
their members to cooperate in this important project

American Association of Collegiate Registrars and Admissions Officers (AACRAO)
American Association of Community and Junior Colleges (AACJC)
American Association of State Colleges and Universities (AASCU)
American Council on Education (ACE)
Association of Catholic Colleges and Universities (ACCU)
Association of Independent Colleges and Schools (AICS)
Association of Jesuit Colleges and Universities (AJCU)
The College Board
National Accrediting Commission of Cosmetology Arts and Sciences (NACCAS)
National Association of College and University Business Officers (NACUBO)
National Association for Equal Opportunity in Higher Education (NAFEO)
National Association of Student Financial Aid Administrators (NASFAA)
National Association of Trade and Technical Schools (NATTS)
National Council of Higher Education Loan Programs (NCHELP)
National Institute of Independent Colleges and Universities (NIICU)
United Negro College Fund, Inc.



UNITED STATES DEPARTMENT OF EDUCATION

OFFICE OF THE ASSISTANT SECRETARY
FOR EDUCATIONAL RESEARCH AND IMPROVEMENT

CENTER FOR EDUCATION STATISTICS

Dear Registrars and Officials:

As part of its longitudinal studies program, the Center for Education
Statistics has been collecting transcript and other information for
persons who have participated in its surveys. To continue this effort,
the Center has authorized the National Opinion Research Center (NORC)
to collect transcripts for individuals who are participating in
the High School and Beyond (HS&B) survey. The goal of this study is to
obtain data that can be aggregated to examine research issues at
the national level. Education researchers and policy analysts will relate
the information about courses taken and credits earned to the characteristics
gathered from questionnaires and other sources. HS&B will enable
researchers to analyze the relationships between coursetaking patterns,
academic achievement, and subsequent occupational choices and success.
Student names are used only to make sure that data on variables from
different sources (tests, questionnaires, and transcripts) refer to the
same individuals and not to find out anything about particular
individuals.

The grant of authority for collection of the transcript data is made
pursuant to the provision in the Family Educational Rights and Privacy
Act (FERPA) (20 U.S.C. 1232g), implemented by 34 CFR 99.31(a)(6), that
allows the release of records to the Secretary of Education or to his agent
without the prior consent of the survey participants. The privacy of
the information you are asked to supply to NORC will be protected, as
required by FERPA. A copy of the relevant section of the act is
reproduced on the reverse side of this page.

We would appreciate your cooperation with NORC in the transcript study.

Sincerely yours,

Emerson J. Elliott
Director
[illegible] in paragraph (a) of this section [illegible]

[illegible] identifiable information from the education
records of a student only on the condition
that the party to whom the information
is disclosed will not disclose [illegible]
without the prior written consent of
the parent of the student or the eligible
student, except that the personally
identifiable information which is disclosed
to an institution, agency or organization
may be used by its officers,
employees and agents, but only for the
purposes for which the disclosure was
made.

(b) Paragraph (a) of this section
does not preclude an agency or institution
from disclosing personally identifiable
information under § 99.31 with
the understanding that the information
will be redisclosed to other parties
under that section; Provided, That the
recordkeeping requirements of § 99.32
are met with respect to each of those
parties.
June 1987

Dear Registrar:

NORC, a social science research center at the University of Chicago,
requests your assistance in the conduct of a Postsecondary Education
Transcript Study. We need your help in collecting transcripts for a
sample of students who are participating in the National Longitudinal
Studies (NLS) program sponsored by the Center for Education Statistics
(CES). The purpose of the transcript study, a component of NLS, is to
obtain reliable and objective information about the types and patterns
of courses taken by students. The data will make it possible for
researchers to relate course-taking patterns to student characteristics
available in the questionnaire files, and to subsequent occupational
choice and success.

[illegible] National Longitudinal Study of the High School
Class of 1972 (NLS-72) and High School and Beyond (HS&B), the latter
conducted by NORC since 197[illegible]. NLS-72 and HS&B constitute a large-scale
longitudinal study of the high school classes of 1972, 1980, and 1982.
Nationally representative samples of the class of 1972 have been
resurveyed five times since graduation, and the classes of 1980 and 1982
[illegible]
class of 1982 have reported attending about 2,100 postsecondary schools.

We would like to obtain the transcripts of one or more sample members who
reported attending your school. Specifically, we are requesting [illegible]
copies of transcripts for each individual named on the enclosed checklist
[illegible] his or her attendance. We
would also appreciate it if you could provide us with: 1) a copy of the
school's course catalog and 2) an interpretation of your grading system
to facilitate accurate and uniform coding of the data. The [illegible]
folder contains more information about the study and our request for
data. You will also find materials concerning applicable federal
regulations and endorsements by professional organizations.

Privacy and confidentiality are always of concern to institutions and
offices that maintain student records. CES and the organizations under
contract to it adhere to the highest standards in protecting the privacy
of individuals involved in the research it undertakes. Appropriate
measures are employed to ensure the confidentiality of research
participants during the collection, analysis, and reporting of all survey
data. Of course, all relevant safeguards will be applied to this study.
[Several lines illegible in the source: a sentence referring to FERPA
requirements and a sentence giving a (312) telephone contact for the
Transcript Study.]

Sincerely,

Barbara K. Campbell
High School and Beyond
Project Director

Transcript Study
Project Director

AMERICAN ASSOCIATION OF COLLEGIATE REGISTRARS AND ADMISSIONS OFFICERS

[illegible] 15, 1987

Dear Colleague:

[illegible] a social science research center at the
University of Chicago, [illegible] transcript data for [illegible] students
[illegible].

The information obtained in this study will make a very [illegible]
contribution to educational policy research [illegible] relationship
of postsecondary [illegible].

Sincerely,

[signature illegible]
NORC

Center for Education Statistics
National Longitudinal Studies Program
High School and Beyond

CES's Longitudinal Studies Program

The mandate of the Center for Education Statistics (CES) of the
U.S. Department of Education includes the responsibility to "collect
and disseminate statistics and other data related to education in the
United States" and to "conduct and publish reports on specific analyses
of the meaning and significance of such statistics" (Education Amendments
of 1974 - Public Law 93-380, Title V, Section 501, amending Part A of
the General Education Provisions Act).

Consistent with this mandate and in response to the need for
policy-relevant, time-series data on a nationally representative sample
of high school students, CES instituted the National Longitudinal
Studies (NLS) program, a continuing long-term project. The general aim
of the NLS program is to study longitudinally the educational, vocational,
and personal development of high school students and the personal,
familial, social, institutional, and cultural factors that may affect
that development.

The NLS program was planned to make use of time-series databases in
two ways: (1) each cohort is surveyed at regular intervals over a span of
years, and (2) comparable data are obtained from successive cohorts,
permitting studies of trends relevant to educational and career development
and societal roles. The NLS program consists of two major studies:
The National Longitudinal Study of the High School Class of 1972 (NLS-72)
and High School and Beyond (HS&B).

High School and Beyond

High School and Beyond (HS&B) is a longitudinal study of the critical
transition years as high school students leave the secondary school system
to begin postsecondary education, work, and family formation. Its purpose
is to provide information on the characteristics, achievements, and plans
of high school students, their progress through high school, and the
transition they make from high school to adult roles. Because of the
breadth of the survey's coverage, data can be used to examine such policy
issues as school effects, bilingual education, dropouts, vocational
education, academic growth, access to postsecondary education, student
financial aid, and life goals. High School and Beyond was designed to
collect data that would be comparable to that of the National Longitudinal
Study of the High School Class of 1972 (NLS-72).

In 1980, a national sample of over 30,000 sophomores and 28,000
seniors enrolled in 1,015 public and private schools participated in the
Base Year Survey. During this stage of the study, students completed a
cognitive test and a questionnaire about their high school experiences and
plans for the future. In order to find out how plans have worked out or
changed, subsamples of the base-year students were asked to complete
follow-up questionnaires in 1982, 1984 and 1986. The 1980 sophomore class
also completed a cognitive test in 1982 when they were seniors. In
addition, base-year data were compiled from such sources as school
administrators, teachers, students' administrative records (transcripts),
and parents of selected students.

In the spring of 1984 a consortium of university research centers
sponsored a study of principals; guidance, vocational, and community
service program counselors; and up to 30 teachers in each one of a sample
of approximately 500 HS&B schools. Results of this survey, funded by the
National Institute of Education, have become part of the HS&B database and
permit researchers to describe the impact of the school environment on the
educational process.

Postsecondary transcripts were collected for the senior cohort of
HS&B in 1984. They contain reliable and objective information about the
types and patterns of courses taken by students in colleges, graduate
schools, and non-collegiate postsecondary institutions. The information
has been merged with the expanding HS&B database. It will be possible for
researchers to relate course-taking patterns to student characteristics
available in the student questionnaire data files and to subsequent
occupational choice and success.

A Financial Aid Records Study was conducted in 1985 for the senior
cohort. Postsecondary schools attended by HS&B students provided data on
the students' costs of attendance, student and family contributions, and
financial aid packages. Guaranteed Student Loan records and Pell Grant
information were collected from central data bases maintained in the
Office of Education. Data from the three sources were then merged to
provide a comprehensive profile of financial assistance.

Currently, records are being requested of Guaranteed Student Loans
and Pell Grants that HS&B sophomores may have obtained. This financial
aid information will be available to complement the postsecondary education
transcripts. Hence, for the 1980 sophomore class, the Department of
Education will have a complete record of high school experiences and
post-high school activities, including postsecondary schooling and financing.

A survey of the 1980 sophomore cohort's postsecondary transcripts
is also underway. Some 2,100 postsecondary institutions are being asked
to participate in this study. Like that of the senior cohort, the
sophomore transcript study will provide information concerning the types
and patterns of courses taken by students and will allow researchers to
relate course-taking patterns to student characteristics available in the
student questionnaire data files, and to subsequent occupational choice
and success.
NATIONAL LONGITUDINAL STUDIES PROGRAM
High School and Beyond
A National Longitudinal Study for the 1980's

INSTRUCTIONS

[illegible] Postsecondary Education Transcript Study involves
[illegible] Chicago. The steps on the following pages provide
details on:

-- whose transcripts are requested

-- which school publications are requested

-- how to return materials to NORC

-- how to be reimbursed by NORC

Step 1: Review student checklist

The student checklist provides the names, in alphabetical order, of the
students for whom copies of the transcript are being requested. In
addition, other names (e.g., maiden, family, alternate spelling, etc.),
social security numbers, and birthdates are provided as additional
identifying information for many students. Please enter a check if you
are enclosing a transcript(s) for a student. If you are unable to
provide some or any of the records for a student, please enter the reason
in the space provided.

EXAMPLES:

"Never attended this school"

"Transcripts cannot be located at this time"

"Did not attend long enough to earn credit"

Two copies of the student checklist have been enclosed. Please return
one copy with your checkmarks and any comments with the transcripts.
The other copy is for your school's records.

Step 2: Retrieve and prepare transcripts

Locate and prepare (e.g., photocopy, generate a computer printout,
etc.) a copy of each transcript for each student on the checklist.

Step 3: Label the transcripts

Affix the enclosed student labels to the back of the appropriate
transcripts.

Step 4: Insert disclosure notices in each student's record file

Disclosure notices indicating the purpose for which student records
were accessed for the transcript study are enclosed for your
convenience.

Step 5: Obtain course catalog(s) or course list(s)

Obtain course catalog(s) or course list(s) describing the courses
offered by your institution. Catalogs should be included for all
programs and schools for which the student has been enrolled (e.g., the
liberal arts college AND the law school). Please indicate on the
checklist whether the current catalog(s) or course list(s) has been
included in the package for return to NORC.

Step 6: Obtain grading system description

Obtain a copy of your school's official description of its grading
[illegible] student performance [illegible]
letter grading (e.g., Pass, High-Pass [illegible]).
In most instances, this would entail translation
of grade designations to verbal (e.g., an "A" = "Outstanding Work")
or quantitative (e.g., "A" = "95-100") definitions.

Step 7: [illegible] for reimbursement of expenses

[illegible] the photocopying required for
[illegible] related expenses, please complete and
return all copies of the enclosed voucher with the transcripts. [illegible]
Transcript Study Project Director (collect) at (312) 702-[illegible].

Step 8: Assemble and send transcripts to NORC

A pre-paid, self-addressed envelope is enclosed for returning the
transcripts and other related materials.

Please return all transcript study materials by August 1. If you
encounter problems of any kind in regard to our request for
transcripts, or you are unable to mail them by August 1 or shortly
thereafter, please call Marcia Turner (collect) at (312) 702-[illegible] or
Shirley Knight (collect) at (312) 702-8950.
HS&B SO 06/87

INST: SUNY AT STONY BK MAIN CAM

POSTSECONDARY EDUCATION TRANSCRIPT STUDY
NATIONAL LONGITUDINAL STUDIES PROGRAM:
High School and Beyond

NORC 4414

STUDENT CHECKLIST: TRANSCRIPTS REQUESTED FOR [number illegible] STUDENTS

PLEASE CHECK BOX IF COURSE CATALOGS OR LIST AND A DESCRIPTION OF THE
SCHOOL'S GRADING SYSTEM ARE ENCLOSED

     [ ] Course catalog or current course list
     [ ] Description of grading system

IF NOT ENCLOSED, ENTER REASON (e.g., not available, non-existent, etc.)

INSTRUCTIONS:

Please send transcripts or equivalent forms used for student progress/performance for the students listed below.
See instruction folder for step-by-step details.

If you are unable to provide any student's transcript, please state the reason for each such student in the space
provided.

Return this checklist with transcripts after having affixed the corresponding label on each.
Retain the second copy of the checklist for your records.

These students reported attending your school between 1982 - 1986

Column headings of the checklist:

     NORC USE ONLY: NUM, CASE ID, DISP
     STUDENT'S NAME
     ALTERNATE NAME (e.g., maiden, family, alternate spelling)
     SOCIAL SECURITY NUMBER
     BIRTHDATE
     TRANSCRIPT ENCLOSED: enter X in box.
     NO TRANS ENCLOSED: enter reason (e.g., "never attended").

[The sample checklist rows are illegible in the source.]

IMPORTANT: PLEASE CHECK REVERSE SIDE FOR ADDITIONAL NAMES.



Appendix B: 



Course Subject Codes in Numerical Order 
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CODING SYSTEM FOR COURSE AND PROGRAM OF STUDY CODING 
FOR HS&B SOPHOMORE TRANSCRIPT SURVEY 



PROGRAM/ 
COURSE    CIP 
CODE      CODE      TITLE 

01        01XXXX    AGRIBUSINESS & AGRICULTURAL PRODUCTION 
02        02XXXX    AGRICULTURAL SCIENCES 
03        03XXXX    RENEWABLE NATURAL RESOURCES 
04        04XXXX    ARCHITECTURE & ENVIRONMENTAL DESIGN 
05        05XXXX    AREA & ETHNIC STUDIES 
06        06XXXX    BUSINESS & MANAGEMENT 
07        0602XX    ACCOUNTING 
08        0603XX    BANKING & FINANCE 
09        07XXXX    BUSINESS & OFFICE 
10        0706XX    SECRETARIAL & RELATED PROGRAMS (Note: this category 
                    does not include typing and general office, which are 
                    in 09 (above)) 
11        08XXXX    MARKETING & DISTRIBUTION 
12        09XXXX    COMMUNICATIONS 
13        0904XX    JOURNALISM 
14        10XXXX    COMMUNICATIONS TECHNOLOGIES 
15        11XXXX    COMPUTER & INFORMATION SCIENCES 
16        1102XX    COMPUTER PROGRAMMING 
17        1103XX    DATA PROCESSING 
18        12XXXX    CONSUMER, PERSONAL & MISCELLANEOUS SERVICES 
19        13XXXX    EDUCATION 
20        131201    ADULT & CONTINUING EDUCATION 
21        131202    ELEMENTARY EDUCATION 
22        131203    JUNIOR HIGH EDUCATION 
23        131204    PRE-ELEMENTARY EDUCATION 
24        131205    SECONDARY EDUCATION 
25        14XXXX    ENGINEERING 
26        1408XX    CIVIL ENGINEERING 
27        141001    ELECTRICAL, ELECTRONICS & COMMUNICATIONS ENGINEERING 
28        1419XX    MECHANICAL ENGINEERING 
29        15XXXX    ENGINEERING & ENGINEERING RELATED TECHNOLOGIES 
30        16XXXX    FOREIGN LANGUAGES 
31        160501    GERMAN 
32        160901    FRENCH 
33        160905    SPANISH 
34        17XXXX    ALLIED HEALTH 
35        170605    PRACTICAL NURSING 
36        18XXXX    HEALTH SCIENCES 
37        1811XX    NURSING 
38        19XXXX    HOME ECONOMICS 
39        20XXXX    VOCATIONAL HOME ECONOMICS 
40        220101    LAW 
41        23XXXX    LETTERS 
42        230401    COMPOSITION 
43        230701    AMERICAN LITERATURE 
44        230801    ENGLISH LITERATURE 
45        25XXXX    LIBRARY & ARCHIVAL SCIENCES 
46        26XXXX    LIFE SCIENCES 
47        27XXXX    MATHEMATICS 
48        279999    CALCULUS 
49        28XXXX    MILITARY SCIENCES (includes 29XXXX: Military 
                    Technologies) 
50        31XXXX    PARKS & RECREATION 
51        32XXXX    FUNCTIONAL SKILLS (includes 32XXXX - 37XXXX: Basic 
                    Skills, Citizenship/Civic Activities, Health-Related 
                    Activities, Interpersonal Skills, Leisure and 
                    Recreational Activities, Personal Awareness) 
52        38XXXX    PHILOSOPHY & RELIGION 
53        39XXXX    THEOLOGY 
54        40XXXX    PHYSICAL SCIENCES 
55        4005XX    CHEMISTRY 
56        400601    GEOLOGY 
57        4008XX    PHYSICS 
58        41XXXX    SCIENCE TECHNOLOGIES 
59        42XXXX    PSYCHOLOGY 
60        43XXXX    PROTECTIVE SERVICES 
61        44XXXX    PUBLIC AFFAIRS 
62        4407XX    SOCIAL WORK (includes Medical Social Work) 
63        45XXXX    SOCIAL SCIENCES 
64        4502XX    ANTHROPOLOGY 
65        4506XX    ECONOMICS 
66        4507XX    GEOGRAPHY 
67        4508XX    HISTORY 
68        4510XX    POLITICAL SCIENCE & GOVERNMENT 
69        4511XX    SOCIOLOGY 
70        46XXXX    CONSTRUCTION TRADES 
71        47XXXX    MECHANICS & REPAIRERS 
72        48XXXX    PRECISION PRODUCTION (includes 21XXXX: Industrial Arts) 
73        49XXXX    TRANSPORTATION & MATERIAL MOVING 
74        50XXXX    VISUAL & PERFORMING ARTS 
75        5003XX    DANCE 
76        5007XX    FINE ARTS 
77        5009XX    MUSIC 
78        24XXXX    LIBERAL/GENERAL STUDIES (includes 30XXXX: Multi/ 
                    Interdiscipline studies) 

95        999995    UNCODEABLE 
96        XXXXXX    TRANSFER COURSES 
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xmxx 



MISSING 
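
The coding table pairs each two-digit program/course code with a six-character CIP pattern in which X stands for any digit, so a general row (for example 06XXXX) and a more specific row (0602XX) can both cover the same course. The sketch below illustrates one way such a table could be applied: pick the matching row with the fewest X positions. This most-specific-match rule, the function names, and the choice to return code 95 for anything unmatched are assumptions made for illustration; the report does not state how coders resolved overlapping rows. Only a handful of rows from the table are reproduced.

```python
# Illustrative subset of the Appendix B table: (CIP pattern, code, title).
# "X" in a pattern stands for any digit.
CODE_TABLE = [
    ("06XXXX", "06", "BUSINESS & MANAGEMENT"),
    ("0602XX", "07", "ACCOUNTING"),
    ("0603XX", "08", "BANKING & FINANCE"),
    ("13XXXX", "19", "EDUCATION"),
    ("131202", "21", "ELEMENTARY EDUCATION"),
    ("27XXXX", "47", "MATHEMATICS"),
    ("279999", "48", "CALCULUS"),
    ("28XXXX", "49", "MILITARY SCIENCES"),
    ("29XXXX", "49", "MILITARY SCIENCES"),  # table note: 49 includes 29XXXX
]

# Assumption: anything that matches no row is treated as code 95.
UNCODEABLE = ("95", "UNCODEABLE")


def pattern_matches(pattern, cip):
    """True if every non-X position of the pattern equals the CIP digit."""
    return all(p == "X" or p == c for p, c in zip(pattern, cip))


def specificity(pattern):
    """Number of fixed (non-X) positions; higher means more specific."""
    return sum(ch != "X" for ch in pattern)


def lookup(cip):
    """Return (code, title) for a six-digit CIP string (hypothetical helper)."""
    if len(cip) != 6 or not cip.isdigit():
        return UNCODEABLE
    candidates = [row for row in CODE_TABLE if pattern_matches(row[0], cip)]
    if not candidates:
        return UNCODEABLE
    best = max(candidates, key=lambda row: specificity(row[0]))
    return best[1], best[2]
```

Under this rule a CIP value such as 060201 resolves to code 07 (ACCOUNTING) rather than the broader code 06, while 060999 falls back to code 06.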
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